Article

Toward Personalized Short-Term PM2.5 Forecasting Integrating a Low-Cost Wearable Device and an Attention-Based LSTM

by Christos Mountzouris *, Grigorios Protopsaltis and John Gialelis
Department of Electrical and Computer Engineering, University of Patras, Rion, 26 504 Patras, Greece
*
Author to whom correspondence should be addressed.
Submission received: 11 September 2025 / Revised: 22 October 2025 / Accepted: 24 October 2025 / Published: 1 November 2025

Abstract

Exposure to degraded indoor air quality (IAQ) conditions represents a major concern for health and well-being. PM2.5 is among the most prevalent indoor air pollutants and constitutes a key indicator in IAQ assessment. Conventional IAQ frameworks often neglect personalization, which in turn compromises the reliability of exposure estimation and the interpretation of associated health implications. In response to this limitation, the present study introduces a human-centric framework that couples wearable sensing with deep learning, employing a low-cost wearable device to capture PM2.5 concentrations in the immediate human vicinity and an attention-based Long Short-Term Memory (LSTM) model to deliver 5-min-ahead exposure predictions. During evaluation, the proposed framework demonstrated strong and consistent performance across both stable conditions and transient spikes in PM2.5, yielding a Mean Absolute Error (MAE) of 0.181 µg/m3. These findings highlight the synergistic potential between wearable sensing and data-driven modeling in advancing personalized IAQ forecasting, informing proactive IAQ management strategies, and ultimately promoting healthier built environments.

1. Introduction

Nowadays, with people spending up to 90% of their time indoors, exposure to compromised indoor air quality (IAQ) has emerged as a major concern for public health. In 2019 alone, the World Health Organization (WHO) estimated that household air pollution was responsible for nearly 86 million lost healthy life years, with the greatest burden affecting women in low-income and middle-income countries [1]. In the same year, WHO attributed more than 7 million annual deaths, including 3.2 million premature deaths, to exposure to poor IAQ conditions, of which 237,000 occurred in children under the age of 5. With more than 750 million people experiencing energy poverty worldwide, widespread reliance on polluting fuels for cooking and heating remains prevalent, further compromising IAQ in vulnerable households. Therefore, improving air quality has been prioritized in the action plans and policy frameworks of the European Union and the United Nations, aligning with the broader objectives of sustainable development, climate resilience, and health equity [2,3].
Indoor air is degraded by numerous pollutants, which are broadly classified as chemical contaminants and biological agents. Chemical contaminants encompass gaseous pollutants—such as carbon oxides (COx), nitrogen oxides (NOx), ozone (O3), and volatile organic compounds (VOCs)—as well as particulate matter (PM). PM represents a complex mixture of solid and liquid particles suspended in air, categorized by aerodynamic diameter into fractions such as PM1, PM2.5, and PM10. A systematic review of 141 IAQ studies conducted across 29 countries revealed that PM2.5 was the predominant research focus, being investigated in 73 of these studies [4]. This focus is not incidental; nearly all established environmental standards have identified PM2.5 as a critical indicator in IAQ assessment, recognizing its causal links to adverse health outcomes. Of particular concern, a range of diseases have been associated with PM2.5, including lung cancer [5], increased susceptibility to respiratory infections such as influenza [6], and cardiovascular conditions such as hypertension [7,8].
To mitigate these health risks, health organizations, environmental regulatory bodies, and scientific institutions have established standards and guidelines that define the acceptable concentration limits for prevalent air pollutants based on exposure duration. Among the existing European standards and guidelines, the most commonly referenced short-term exposure limits for PM2.5 are 40 µg/m3 for the 8 h mean and 25 µg/m3 for the 24 h mean, while the stipulated annual limit is set at 15 µg/m3 [9]. The WHO guidelines propose more health-protective and stringent limits for PM2.5, recommending mean concentrations of 15 µg/m3 over 24 h and 5 µg/m3 annually. Alarmingly, WHO has reported that nearly 99% of the global population remains exposed to pollutant concentrations exceeding these thresholds [10].
Recent advances in sensing technologies have helped bridge the long-standing gap between IAQ policies and their practical implementation, enabling accurate yet cost-effective monitoring of air pollutants. The rise of wearable and portable sensors has accelerated the paradigm shift toward personalized IAQ monitoring, marking a transition from spatially averaged measurements to characterization of contamination dynamics within the human microenvironment. Harnessing lightweight communication protocols, edge computing, and data analytics, sensors have evolved into the backbone of modern, smart IAQ ecosystems, facilitating an integrated, end-to-end approach to air quality management and predictive modeling [11]. Nevertheless, limitations in spatial coverage and multi-sensing capabilities still constrain their scalability in real-world applications. In response, virtual sensing has gained prominence, reliably estimating air pollutant concentrations through sparse sensing networks using advanced spatio-temporal interpolation methods and data fusion techniques [12].
Machine Learning (ML) and Artificial Intelligence (AI) are emerging as pivotal enablers of IAQ modeling, driving a transition from reactive monitoring systems toward data-driven, adaptive solutions [13]. Beyond forecasting, AI/ML could be instrumental in improving sensors’ accuracy and reliability through low-cost calibration [14,15,16] and expanding their spatial coverage through virtual sensing [17]. The advent of Large Language Models (LLMs) has revolutionized the AI landscape, spurring significant advances in natural language understanding and human–machine interaction. Increasingly, these models are being explored as tools for integrating conversational interfaces with analytical and forecasting workflows in IAQ applications [18]. By doing so, LLMs foster the democratization of air quality knowledge by providing accessible insights on air pollution, its associated health implications, and effective ventilation strategies [19].
The main research objective of this work was to introduce a human-centric framework for forecasting PM2.5 concentrations 5 min ahead in the immediate human vicinity. Integral to this framework is a low-cost, arm-worn wearable device that enables precise estimation of immediate PM2.5 exposure in a minimally intrusive manner. Such a human-centric approach eliminates the need to interpolate room-level measurements to human-proximal conditions, facilitating personalized exposure estimation, health risk assessment, and proactive air-filtration management.
The remainder of this paper is organized as follows. Section 2 reviews prior studies in the IAQ domain with related research objectives. Section 3 describes the wearable device and the data collected, explores the main distributional characteristics and dependencies of the data, and presents the attention-based LSTM architecture. Section 4 evaluates the proposed framework's predictive performance. Finally, Section 5 discusses the research findings.

2. Related Works

The robust performance of deep learning architectures in PM2.5 forecasting has been consistently substantiated across several studies, with LSTM emerging as one of the principal focal points in recent research. Integrated LSTM and Convolutional Neural Network (CNN) approaches have also received considerable research attention owing to their complementary strengths in modeling both temporal and spatial dependencies in PM2.5 dynamics. Integrated LSTM and Bayesian Neural Network (BNN) approaches have emerged as effective modeling paradigms for quantifying uncertainty and supporting probabilistic decision-making in the IAQ domain. Ensemble empirical mode decomposition (EEMD) has proven to be a prominent technique for decomposing complex, non-stationary time-series into interpretable temporal modes, enhancing the learning efficacy of LSTM models. The remainder of this section outlines methodological approaches proposed for PM2.5 prediction and reports their performance in terms of Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and Mean Absolute Percentage Error (MAPE).
Hybrid CNN-LSTM approaches for PM2.5 forecasting have been investigated in three studies targeting a 1 h ahead horizon. Huang and Kuo [20] proposed a CNN-LSTM multilayer structure with an augmented attention mechanism, termed Attention-based Parallel Networks (APNet), where a 1D-CNN extracts local spatial structures in PM2.5 concentrations and an LSTM captures their temporal correlations. Using multivariate input sequences comprising cumulative PM2.5 levels, wind speed, and rainfall duration from the preceding day, APNet achieved a MAE of 14.63 µg/m3 and an RMSE of 34.23 µg/m3, outperforming standalone CNN and LSTM configurations. Zaini et al. [21] developed a hybrid CNN-LSTM model, leveraging a two-year dataset with outdoor air quality measurements from two monitoring stations in Malaysia. Although the hybrid model exhibited superior performance compared to standalone CNN and LSTM configurations, applying EEMD further enhanced the accuracy of the standalone LSTM, yielding a MAE of 2.8 µg/m3 and an RMSE of 4.9 µg/m3. Under a similar experimental design involving two outdoor air quality monitoring stations in Beijing, Bai et al. [22] implemented an LSTM preceded by EEMD-based time-series decomposition. Their model reported comparable accuracy metrics for the two monitoring sites: RMSE of 14.0 µg/m3 and 12.1 µg/m3, MAPE of 19.6% and 16.9%, and R2 of 0.994 and 0.991. Chae et al. [23] developed an interpolated CNN designed to predict PM2.5 concentrations in unmonitored areas, yielding an R2 of 0.97 with an RMSE accounting for 16% of the concentrations' variability.
In contexts emphasizing temporal rather than spatial PM2.5 variability, Recurrent Neural Networks (RNNs) have emerged as particularly effective approaches. Prihatno et al. [24] demonstrated the superior performance of bidirectional RNN architectures over unidirectional LSTM, CNN-LSTM, and Transformers in PM2.5 forecasting. The bidirectional LSTM yielded comparable MAE values of 2.65 µg/m3 and 2.67 µg/m3 for 5 min and 30 min ahead, respectively, and 3.95 µg/m3 for 1 h ahead. X. Dai et al. [25] employed an RNN to enable automated control actions in residential ventilation systems, combining an autoencoder with a recurrent processing layer. For 30 min ahead, it achieved MAE values of 8.3 µg/m3 and 9.2 µg/m3 for PM2.5 levels below and above 50 µg/m3, respectively. P. Karaiskos et al. [26] applied an LSTM model to predict future time intervals expected to satisfy the threshold values stipulated in IAQ standards and guidelines, thereby identifying periods of healthy indoor environment. Conducting experiments in a naturally ventilated fitness center in Texas, the model achieved an 85.7% detection rate. Lagesse et al. [27] argued that LSTM models could serve as viable alternatives to costly monitoring devices, enabling PM2.5 forecasting using data from open, low-cost, and high-end sensors. With 5 min granularity, the LSTM achieved a normalized RMSE of 0.024, nearly matching the precision observed with expensive nephelometer instrumentation, thereby reducing the cost barrier for building managers seeking actionable IAQ insights.
Beyond the IAQ domain, gradient boosting algorithms have shown notable performance in time-series forecasting, effectively capturing complex non-linear relationships among temporal features; however, their practical implementation in real-world IAQ applications remains limited. A. Singh et al. [28] developed an XGBoost model to forecast PM2.5 concentrations one hour ahead, attaining a MAPE of 0.040% and an RMSE below 10−3 µg/m3, outperforming deep learning architectures including LSTM.
BNN architectures aimed at predicting indoor PM2.5 concentrations were explored in two studies. First, H. Dai et al. [29] employed a BNN leveraging an extensive IAQ dataset comprising more than 1.4 million records to predict average day-ahead PM2.5 levels across residential settings in China. This model incorporated socioeconomic factors as additional predictors to enhance its accuracy, achieving a MAE of 9.45 µg/m3 and an RMSE of 13.3 µg/m3. Utama et al. [30] proposed an integrated BNN-LSTM architecture to forecast PM2.5 at 1-, 2-, and 3 h horizons, yielding MAE values of 0.90 µg/m3, 2.94 µg/m3, and 3.31 µg/m3 for the respective horizons, outperforming LSTM and CNN-LSTM configurations.
Z. Zhang and S. Zhang [31] introduced a novel Sparse Transformer Network (STN) with a multi-head attention mechanism on both the encoder and decoder sides, aimed at minimizing computational overhead and processing time. Evaluated against CNN, LSTM, and Transformer architectures at two locations in China over 1, 6, 12, and 24 h horizons, it exhibited an optimal MAE of 3.76 µg/m3 for Taizhou and 11.13 µg/m3 for Beijing. Mahajan et al. [32] implemented a univariate time-series model using exponential smoothing with drift (ESD), leveraging data from 132 IoT-based air quality monitoring stations in Taiwan. The proposed model outperformed traditional time-series models and deep learning (DL) architectures, attaining a MAE of 0.16 µg/m3.

3. Methods and Materials

Section 3 outlines the materials and methods underpinning the proposed predictive framework for estimating personalized PM2.5 exposure. Section 3.1 presents the low-cost wearable device used to collect PM2.5 measurements from the immediate human vicinity. Section 3.2 examines the distributional characteristics underlying the collected data. Section 3.3 explores temporal patterns and dependencies in PM2.5 measurements. Section 3.4 describes the architecture of the attention-based LSTM models. Section 3.5 details the hyperparameter tuning procedure, and Section 3.6 defines the metrics used to evaluate predictive performance.

3.1. Wearable Device and Data Collection

The low-cost wearable device deployed to monitor an individual's direct exposure to PM2.5 integrates a Sensirion SPS30 optical particle counter based on laser-scattering principles [33]. This sensor detects particles within the 0.3–10 µm aerodynamic diameter range and computes cumulative mass concentrations for PM1, PM2.5, PM4, and PM10. It operates over a temperature range of −10 °C to 60 °C and a relative humidity range of 0% to 95% (non-condensing), and provides digital outputs through Universal Asynchronous Receiver/Transmitter (UART) and Inter-Integrated Circuit (I2C) interfaces. According to the Sensirion specifications, the SPS30 exhibits a precision error of ±(5% + 5 µg/m3) for PM2.5 concentrations below 100 µg/m3 under reference conditions (25 °C and nominal supply voltage) [34].
The SPS30 incorporates advanced contamination-resistance mechanisms to mitigate the effects of dust accumulation and humidity. To further enhance measurement precision under varying humidity conditions, a calibration model based on the humidity-adjustment equations proposed by the U.S. Environmental Protection Agency (EPA) was applied to the raw PM2.5 measurements, compensating for hygroscopic particle growth that causes artificial overestimation in light-scattering sensors and aligning the results with reference-grade monitors [35,36]. The EPA model addresses this effect through a set of piecewise regression equations that correct the uncalibrated sensor output $PM_{2.5,\mathrm{raw}}$ as a function of relative humidity $RH$. Specifically, for $PM_{2.5,\mathrm{raw}} < 30$ µg/m3, the calibrated concentration $PM_{2.5}$ is given by Equation (1).
$$PM_{2.5} = 0.524 \times PM_{2.5,\mathrm{raw}} - 0.0862 \times RH + 5.75 \quad (1)$$
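For illustration, a minimal Python sketch of this correction is given below. Only the low-concentration branch quoted in the text ($PM_{2.5,\mathrm{raw}} < 30$ µg/m3) is implemented, and the function name is illustrative; the EPA scheme defines further piecewise branches for higher readings that are not reproduced here.

```python
def correct_pm25(pm25_raw: float, rh: float) -> float:
    """Humidity-corrected PM2.5 (Equation (1)) for raw readings below 30 ug/m3."""
    if pm25_raw < 30.0:
        return 0.524 * pm25_raw - 0.0862 * rh + 5.75
    # Higher-concentration branches of the EPA scheme are not quoted in the text,
    # so the raw value is returned unchanged here.
    return pm25_raw
```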
PM2.5 measurements are recorded at a 5 s resolution and transmitted to the microcontroller via the Serial Peripheral Interface (SPI) bus, which facilitates both data logging and wireless communication. The LoRaWAN system-on-chip STM32WL55 was integrated into the mainboard, ensuring low-power operation, long-range connectivity and reliable data transmissions. The wearable device is powered by a 1600 mAh lithium-polymer battery with dual charging options: USB Type-C and a photovoltaic panel, allowing for extended operation without frequent recharging.
As presented in Figure 1, the wearable device was worn on the upper right arm of a municipal office worker in Trikala, Greece, during continuous 8 h monitoring sessions conducted over a 3-month period in typical morning office hours. The arm-worn configuration was considered the optimal placement strategy, as it balanced the need to maintain proximity to the respiratory intake area while minimizing discomfort, obtrusiveness, and workflow disruption. Data acquired from the wearable device were published to a dedicated topic of an MQTT broker and stored in a PostgreSQL time-series database, resulting in a comprehensive data collection of 518,512 samples of PM2.5 concentrations.
Although a 5 s sampling interval captures the fine-scale fluctuations essential for real-time IAQ monitoring, it introduces two major challenges. First, high-granularity time-series are prone to short-term noise and sensor artifacts that may not reflect meaningful variations. Second, from an ML perspective, such a high temporal resolution increases computational overhead and the risk of overfitting to noise. To address these challenges, a conservative aggregation strategy was employed, whereby each batch of 12 consecutive samples was combined into a single representative value using the median as the aggregation function. The resampling process yielded a reduced dataset $D$ with $|D| = 43{,}209$.
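A minimal pandas sketch of this aggregation step is shown below; the file name and column names are placeholders, and the 60 s resampling window corresponds to the 12-sample batches described above.

```python
import pandas as pd

# Load the raw 5 s PM2.5 log (illustrative file and column names).
raw = pd.read_csv("pm25_raw.csv", parse_dates=["timestamp"], index_col="timestamp")

# Collapse every 12 consecutive 5 s samples (one minute) into their median.
pm25_1min = raw["pm25"].resample("60s").median().dropna()
```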
Let $y_t \in \mathbb{R}_+$ denote the measured PM2.5 concentration at a specific time step $t$. Then, the univariate input sequence $\mathbf{y}_i$ for the LSTM model, starting at position $i$ with a temporal context window of length $k$, is formulated as in Equation (2).
$$\mathbf{y}_i = [y_i,\, y_{i+1},\, \ldots,\, y_{i+k-1}]^{T} \in \mathbb{R}_+^{k} \quad (2)$$
Let $h$ denote the horizon at which a single-step prediction is produced by the LSTM model parametrized by $\theta$. Then, the estimated PM2.5 concentration at that horizon is defined as in Equation (3), where $S \subseteq D$ represents the set of input sequences used for model training.
$$\hat{y}_{t+h} = F(\mathbf{y}_i; \theta) \in \mathbb{R}_+, \quad \mathbf{y}_i \in S \quad (3)$$
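The sketch below constructs the input windows and targets of Equations (2) and (3) from the aggregated series; `make_sequences` is a hypothetical helper, with k = 10 and h = 5 matching the values used later in Section 3.4.

```python
import numpy as np

def make_sequences(y: np.ndarray, k: int = 10, h: int = 5):
    """Build windows [y_i, ..., y_{i+k-1}] and targets y_{i+k-1+h} (Equations (2)-(3))."""
    X, targets = [], []
    for i in range(len(y) - k - h + 1):
        X.append(y[i:i + k])
        targets.append(y[i + k - 1 + h])
    return np.asarray(X)[..., None], np.asarray(targets)

X, targets = make_sequences(pm25_1min.values)
```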

3.2. Distributional Characteristics for PM2.5 Levels

As a preliminary step, the principal distributional characteristics of $D$ were explored to gain insights into the central tendency and variability of the PM2.5 measurements, with the results illustrated by the boxplot in Figure 2. The mean concentration was 8.44 ± 2.91 µg/m3, corresponding to a coefficient of variation of 34%, which indicated pronounced PM2.5 fluctuations during the monitoring period. This considerable dispersion was advantageous for model training, as it exposed the model to diverse and non-stationary PM2.5 patterns. The average PM2.5 concentration also remained well below the acceptable upper limit of 15.00 µg/m3 reflected in European standards for long-term exposure [9]; however, it exceeded the more restrictive WHO guideline (5.00 µg/m3). The median PM2.5 concentration of 8.10 µg/m3 suggested a highly symmetric central distribution.
The quartile breakdown revealed a narrow interquartile range (IQR) of 3.70 µg/m3, with $Q_1$ positioned at 6.40 µg/m3 and $Q_3$ at 10.00 µg/m3, implying limited dispersion among the central 50% of measurements. Despite that narrow IQR, multiple outliers were identified, predominantly above the upper whisker at $Q_3 + 1.5 \times IQR$, indicative of intermittent spikes in PM2.5 concentrations over the monitoring period. This outlier prevalence allowed for a more rigorous evaluation of the model's robustness beyond the typical patterns governing PM2.5 concentrations, including extreme contamination events. The minimum concentration observed was 1.20 µg/m3, while the maximum reached 39.80 µg/m3.

3.3. Exploration of Linear Dependencies and Shared Information in PM2.5 Concentrations

3.3.1. Autocorrelation Patterns in PM2.5 Concentrations

Prior to the autocorrelation analysis, the Augmented Dickey–Fuller (ADF) statistical test was conducted to assess the stationarity of the time-series, a necessary condition for valid autocorrelation inference. The test examined the null hypothesis of a unit root, whose presence would indicate non-stationarity driven by stochastic trends or random-walk components. With an ADF statistic of $\tau_{ADF} = -8.12$, well below the 1% critical value of $-3.431$, and a p-value of approximately $1 \times 10^{-12}$, the results provided strong statistical evidence to reject the null hypothesis of a unit root, thereby confirming the stationarity assumption.
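A sketch of this stationarity check with statsmodels, applied to the aggregated series from the earlier preprocessing sketch:

```python
from statsmodels.tsa.stattools import adfuller

adf_stat, p_value, *_ = adfuller(pm25_1min.values)
print(f"ADF statistic: {adf_stat:.2f}, p-value: {p_value:.1e}")
# A statistic well below the 1% critical value rejects the unit-root null.
```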
To analyze the temporal patterns under the stationarity assumption for the time-series PM2.5 measurements, Autocorrelation (ACF) and Partial Autocorrelation Functions (PACF) were used, with their corresponding plots presented in Figure 3. Aligned with the targeted short-term 5 min forecasting horizon, the lookback windows for both ACF and PACF were restricted up to 30 lagged values, representing PM2.5 dynamics in the immediate human surroundings over a 30 min period.
The ACF revealed strong and persistent autocorrelations in PM2.5 concentrations across the examined 30 min interval, with correlation coefficients $r_{ACF} > 0.85$. The gradual decline in ACF values further supported the presence of long-memory characteristics in the time-series. In contrast, the PACF identified a dominant partial autocorrelation at lag-1 and a moderate secondary effect at lag-2. For subsequent lags, the coefficients declined toward zero, except at lag-6, where a notable resurgence ($\approx 0.25$) suggested a potential periodic component. In summary, the most recent PM2.5 concentrations emerged as the primary predictive drivers for short-term horizons, while higher-order lag effects were generally negligible.
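The corresponding ACF/PACF inspection can be reproduced along the following lines (a sketch; the lag limit of 30 matches the text):

```python
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

fig, axes = plt.subplots(1, 2, figsize=(10, 3))
plot_acf(pm25_1min.values, lags=30, ax=axes[0])                 # gradual decay
plot_pacf(pm25_1min.values, lags=30, method="ywm", ax=axes[1])  # dominant lag-1
plt.tight_layout()
plt.show()
```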

3.3.2. Mutual Information in PM2.5 Concentrations

A limitation of ACF and PACF is that they primarily capture linear temporal dependencies, whereas more complex non-linear relationships may exist across different lag intervals in PM2.5 concentrations. Despite this limitation, the immediate lags at 1, 2, and 5 steps warrant particular attention due to their strong linear predictive signals.
The Mutual Information (MI) measure was employed to quantify the shared information between the current PM2.5 levels and their lagged counterparts up to 30 steps prior. This information-theoretic measure does not assess non-linearity per se, but determines the total statistical dependence through uncertainty reduction in current PM2.5 concentrations given knowledge of preceding ones. Consequently, when analyzed in conjunction with the ACF and PACF, it facilitates the identification of latent temporal and non-linear dependencies in PM2.5 dynamics.
Figure 4 illustrates the results of the MI analysis. At the shortest lags, MI revealed pronounced shared information among PM2.5 concentrations, particularly up to lag-5, with MI scores of 1.95, 1.68, 1.49, 1.35, and 1.25. From lag-5 onwards, MI scores reached a plateau around 1.10, demonstrating weaker yet detectable dependencies. These results contrasted with those obtained from the PACF, where linear dependencies diminished after lag-2, except for a resurgence at lag-5, suggesting the presence of non-linear components.
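The lagged MI analysis can be sketched with scikit-learn as follows; `mutual_info_regression` relies on a nearest-neighbor estimator, so the absolute scores depend on its settings and need not match the reported values exactly.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression

y = pm25_1min.values
mi_scores = []
for lag in range(1, 31):
    lagged = y[:-lag].reshape(-1, 1)   # PM2.5 at t - lag
    current = y[lag:]                  # PM2.5 at t
    mi_scores.append(mutual_info_regression(lagged, current, random_state=0)[0])
```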

3.4. Modeling PM2.5 Using an Attention-Based LSTM Architecture

The dataset D was partitioned into training D t r a i n and testing subsets D t e s t using an 85/15 split ratio that preserved the chronological structure of the time-series, allocating the most recent 15% of the measurements to D t e s t . The resulting holdout set, comprising 6481 unseen samples, provided an unbiased estimate of the model’s generalization performance. A walk-forward validation was applied to D t r a i n using 5 temporal splits to reduce the period-specific bias inherent in single chronological train-validation splits, thereby enhancing generalization robustness through multiple independent evaluations across distinct time intervals and avoiding data leakage. Each fold involved expanding training windows and non-overlapping sequential validation sets, maintaining a 70/30 ratio and progressively incorporating additional measurements for training.
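A simplified sketch of the chronological split is shown below; scikit-learn's `TimeSeriesSplit` is used here as a stand-in for the expanding-window scheme, and the exact 70/30 per-fold ratio described above would require a custom splitter.

```python
from sklearn.model_selection import TimeSeriesSplit

n_test = int(0.15 * len(X))                 # X, targets from the earlier sketch
X_train, y_train = X[:-n_test], targets[:-n_test]
X_test, y_test = X[-n_test:], targets[-n_test:]

tscv = TimeSeriesSplit(n_splits=5)          # expanding train, sequential validation
for fold, (tr_idx, val_idx) in enumerate(tscv.split(X_train)):
    print(f"fold {fold}: train={len(tr_idx)}, validation={len(val_idx)}")
```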
An attention-based LSTM model was employed to produce single-step predictions of PM2.5 concentration, denoted as y ^ t + 5 R + , at a forecasting horizon of h = 5 min, using univariate input sequences y i R + 10 consisting of k = 10 consecutive PM2.5 measurements. This 10 min lookback window was selected according to the findings reported in Section 3.3, which evidenced strong temporal dependencies in PM2.5 concentrations over this time range. Specifically, the ACF exhibited a persistent autocorrelation pattern with a gradual decay extending up to 30 min, while the PACF revealed that the dominant direct temporal dependencies were mainly confined to short lags (up to two minutes), with a secondary peak at lag-6. MI corroborated these findings, exhibiting asymptotic information convergence well before 10 min, thereby confirming that the key predictive information resides within the selected temporal window.
The proposed attention-based LSTM model integrated a deep, stacked LSTM encoder consisting of multiple gated recurrent layers that capture hierarchical temporal patterns, coupled with a custom positional attention mechanism designed to introduce a recency bias during sequence aggregation. Let $L \in \mathbb{Z}_+$ denote the depth of the stacked LSTM encoder and $h_t^{(l)} \in \mathbb{R}^d$ represent the $d$-dimensional hidden state of the $l$-th LSTM layer at time step $t \in \{1, \ldots, 10\}$. For an input sequence $\mathbf{y}_i \in \mathbb{R}_+^{10}$, Equation (4) defines the computations in the first layer, where $y_{i,t} \in \mathbb{R}_+$ denotes the PM2.5 concentration at time step $t$; Equation (5) specifies each subsequent layer; and Equation (6) yields the model output $H \in \mathbb{R}^{10 \times d}$.
$$h_t^{(1)} = \mathrm{LSTM}^{(1)}\big(y_{i,t},\, h_{t-1}^{(1)}\big), \quad \text{with } h_0^{(1)} = \mathbf{0} \quad (4)$$
$$h_t^{(l)} = \mathrm{LSTM}^{(l)}\big(h_t^{(l-1)},\, h_{t-1}^{(l)}\big), \quad \text{with } h_0^{(l)} = \mathbf{0} \text{ and } l \in \{2, \ldots, L\} \quad (5)$$
$$H = \big[h_1^{(L)}, \ldots, h_{10}^{(L)}\big] \in \mathbb{R}^{10 \times d} \quad (6)$$
The custom positional attention mechanism implemented a learnable weighting scheme that encouraged the model to focus on the most informative time steps within the input sequences through explicit recency-bias injection. Grounded in prior findings indicating that the most recent PM2.5 measurements carry the highest predictive importance, this mechanism combined a trainable attention weight vector $w \in \mathbb{R}^d$ with a monotonically increasing deterministic temporal bias to prioritize the most recent hidden states. Softmax normalization was applied to attention scores $e_t$ computed over the sequence of hidden states $h_t^{(L)} \in \mathbb{R}^d$ from the final LSTM layer, allowing the mechanism to perform differentiable, adaptive temporal aggregation across the entire lookback window. Let $b_t \in \mathbb{R}$ denote the learned bias for each position $t$ and $\beta_t \in \mathbb{R}$ denote the fixed positional bias for each time step $t$. Then, the computation of the attention scores, normalized weights, and context vector follows Equations (7)–(9).
$$e_t = \tanh\big(w^{T} h_t^{(L)} + b_t + \beta_t\big) \in \mathbb{R} \quad (7)$$
$$a_t = \frac{\exp(e_t)}{\sum_{k=1}^{10} \exp(e_k)} \in (0, 1), \quad \text{with } \sum_{t=1}^{10} a_t = 1 \quad (8)$$
$$c = \sum_{t=1}^{10} a_t\, h_t^{(L)} \in \mathbb{R}^{d} \quad (9)$$
The output layer of the proposed LSTM architecture consists of a fully connected dense layer $\mathbb{R}^d \rightarrow \mathbb{R}_+$ that maps the attention-aggregated context vector $c \in \mathbb{R}^d$ to the final prediction $\hat{y}_{t+5} \in \mathbb{R}_+$. It applies a linear transformation with a learned weight vector $w_o \in \mathbb{R}^d$ and bias term $b_o \in \mathbb{R}$ to the context vector $c$, yielding a scalar output corresponding to the predicted PM2.5 concentration at the desired 5 min forecasting horizon, as defined in Equation (10).
$$\hat{y}_{t+5} = w_o^{T} c + b_o \in \mathbb{R}_+ \quad (10)$$
During training, the LSTM variants were optimized to minimize a Mean Squared Error (MSE) loss function $\mathcal{L}$, as described in Equation (11), where $B$ denotes the batch size. In addition, early stopping was applied for model regularization, terminating training at epoch $T$ when the validation loss $\mathcal{L}_{val}$ plateaued, using a patience of 4 epochs.
$$\mathcal{L} = \frac{1}{B} \sum_{j=1}^{B} \big(\hat{y}_{t+5}^{(j)} - y_{t+5}^{(j)}\big)^{2} \quad (11)$$
Figure 5 depicts the proposed attention-based LSTM architecture. The model ingests univariate input sequences of 10 consecutive PM2.5 measurements, representing a 10 min lookback window. Two stacked LSTM layers encode temporal dependencies and produce hidden states for each time step, which are subsequently weighted by a positional attention mechanism (softmax-normalized) according to their relevance. The resulting context vector aggregates the weighted hidden representations and is passed through a dense output layer to generate the forecasted PM2.5 concentration 5 min ahead.
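A minimal Keras sketch of this architecture is given below, assuming TensorFlow. The layer widths and dropout rates correspond to configuration C1 reported later, the placement of dropout between recurrent layers is an assumption, and the exact parameterization of the fixed recency bias β is illustrative, since only its monotonically increasing form is specified in the text.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

class PositionalAttention(layers.Layer):
    """Positional attention with a learnable score vector, a learnable per-step
    bias, and a fixed monotonically increasing recency bias (Equations (7)-(9))."""
    def __init__(self, seq_len=10, **kwargs):
        super().__init__(**kwargs)
        self.seq_len = seq_len

    def build(self, input_shape):
        d = int(input_shape[-1])
        self.w = self.add_weight(name="w", shape=(d, 1), initializer="glorot_uniform")
        self.b = self.add_weight(name="b", shape=(self.seq_len, 1), initializer="zeros")
        # Fixed recency bias (assumed linear from 0 to 1 over the lookback window).
        self.beta = tf.reshape(tf.linspace(0.0, 1.0, self.seq_len), (self.seq_len, 1))

    def call(self, h):
        # h: (batch, seq_len, d) hidden states from the final LSTM layer
        e = tf.tanh(tf.matmul(h, self.w) + self.b + self.beta)  # (batch, seq_len, 1)
        a = tf.nn.softmax(e, axis=1)                            # attention weights
        return tf.reduce_sum(a * h, axis=1)                     # context vector (batch, d)

def build_model(k=10, units=(32, 16, 8), dropout=(0.1, 0.2, 0.1), lr=8e-4, l2_reg=3e-4):
    reg = tf.keras.regularizers.l2(l2_reg)
    inputs = layers.Input(shape=(k, 1))
    x = inputs
    for u, dr in zip(units, dropout):
        x = layers.LSTM(u, return_sequences=True, kernel_regularizer=reg)(x)
        x = layers.Dropout(dr)(x)
    context = PositionalAttention(seq_len=k)(x)
    outputs = layers.Dense(1)(context)  # scalar PM2.5 forecast 5 min ahead
    model = models.Model(inputs, outputs)
    model.compile(optimizer=tf.keras.optimizers.Adam(lr), loss="mse", metrics=["mae"])
    return model
```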

3.5. Hyperparameter Tuning for Attention-Based LSTM Performance

Given the prohibitive computational overhead of exhaustive grid searches in deep learning (DL) hyperparameter tuning, the present study adopted a history-aware Bayesian optimization approach that strategically navigated a mixed discrete–continuous, high-dimensional hyperparameter space $H$, facilitating the early pruning of unpromising architectures and configurations. The key idea behind this approach is to model the probability distributions of different hyperparameter configuration vectors $h \in H$ conditioned on their observed predictive performance, utilizing the Tree-structured Parzen Estimator (TPE) technique. In each iteration, an acquisition-based optimization problem was formulated to steer the selection of subsequent candidate configuration vectors $h$ toward promising regions of the hyperparameter space $H$, likely to minimize an objective cost function $J(h)$ reflecting the validation loss, as expressed in Equation (12).
$$h^{*} = \underset{h \in H}{\arg\min}\ J(h) \quad (12)$$
A coarse-to-fine paradigm was followed, initially exploring a broad hyperparameter space $H_1$ comprising multiple hierarchical sequential LSTM structures, learning rates, and regularization strengths and patterns. Subsequently, a narrower hyperparameter space $H_2$ was constructed, concentrated around the regions closest to the global minimum of the objective cost function $J(h)$ in terms of validation error.
For the initial space $H_1$, shallow-to-deep sequential LSTM architectures were investigated, comprising $L \in L_1 = \{3, 4, 5\}$ layers with a monotonically decreasing pattern of $u \in u_1 = \{128, 64, 32, 16, 8\}$ units per layer. These configuration parameters were selected to align with the problem's limited temporal scope. To further mitigate overfitting risk, different dropout strategies $d_{1,i}$ were explored with rates $r \in \{0.0, 0.1, 0.2, 0.3\}$, including (i) no dropout $d_{1,1}$ with all rates equal to zero, (ii) uniform dropout $d_{1,2}$ with equal rates across the layers, (iii) pyramid dropout $d_{1,3}$ with peak rates at the middle layers, and (iv) terminal-heavy dropout $d_{1,4}$ with stronger regularization at the terminal layer. The learning rate and L2 regularization strength were drawn from continuous ranges, restricted to $\eta_1 \in [10^{-4}, 10^{-2}]$ and $\lambda_1 \in [10^{-5}, 10^{-2}]$. The complete configuration space $H_1$ investigated for the initial model search is summarized in Equation (13).
$$H_1 = \{(L, u, d, \eta, \lambda) \mid L \in L_1,\ u \in u_1,\ d \in d_1,\ \eta \in \eta_1,\ \lambda \in \lambda_1\} \quad (13)$$
For the refined space $H_2$, a tightened grid of parameters was explored, capitalizing on the results from the $n = 5$ best-performing configurations in $H_1$. A key distinction from the previous search was that the learning rates $\eta_2 \subset \eta_1$ and L2 regularization strengths $\lambda_2 \subset \lambda_1$ were restricted to a discrete set of two values each, representing those that performed best across the prior explorations in $H_1$. The refined configuration space $H_2$ is defined in Equation (14).
$$H_2 = \{(L, u, d, \eta, \lambda) \mid L \in L_2,\ u \in u_2,\ d \in d_2,\ \eta \in \eta_2,\ \lambda \in \lambda_2\} \quad (14)$$
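A sketch of the TPE-driven search with Optuna is shown below. It reuses the `build_model` helper and training-split names from the earlier sketches, evaluates each trial on a single hold-out split instead of the full walk-forward scheme, and samples dropout rates per layer rather than as the named strategies above, so it approximates rather than reproduces the described search.

```python
import optuna

def objective(trial):
    n_layers = trial.suggest_int("layers", 3, 5)
    units = [trial.suggest_categorical(f"units_{i}", [128, 64, 32, 16, 8])
             for i in range(n_layers)]
    dropout = [trial.suggest_float(f"dropout_{i}", 0.0, 0.3, step=0.1)
               for i in range(n_layers)]
    lr = trial.suggest_float("lr", 1e-4, 1e-2, log=True)
    l2 = trial.suggest_float("l2", 1e-5, 1e-2, log=True)
    model = build_model(units=tuple(units), dropout=tuple(dropout), lr=lr, l2_reg=l2)
    history = model.fit(X_train, y_train, validation_split=0.3,
                        epochs=50, batch_size=32, verbose=0)
    return min(history.history["val_mae"])

study = optuna.create_study(direction="minimize",
                            sampler=optuna.samplers.TPESampler())
study.optimize(objective, n_trials=40)
```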

3.6. Evaluation Metrics for Attention-Based LSTM Performance

The implemented attention-based LSTM variants were evaluated using the Mean Absolute Error (MAE), defined in Equation (15), where $N = |D_{test}|$ denotes the total number of samples in $D_{test}$. This metric quantifies the average magnitude of absolute errors regardless of their direction, that is, the absolute difference between the actual $y_{t+5}$ and the predicted $\hat{y}_{t+5}$. MAE was also employed as a complementary metric to the training and validation losses during each training epoch.
$$MAE = \frac{1}{N} \sum_{j=1}^{N} \big|\hat{y}_{t+5}^{(j)} - y_{t+5}^{(j)}\big| \quad (15)$$
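For completeness, Equation (15) corresponds to the following one-line NumPy computation (a sketch over arrays of actual and predicted values):

```python
import numpy as np

def mae(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean Absolute Error (Equation (15))."""
    return float(np.mean(np.abs(y_pred - y_true)))
```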

4. Results

Section 4 reports the comparative performance of attention-based LSTM variants in forecasting PM2.5 concentrations 5 min ahead. Section 4.1 outlines the hyperparameter optimization process. Section 4.2 then presents the training and validation of the three best-performing model configurations identified through optimization and assesses their predictive performance on unseen data.

4.1. Results from Hyperparameter Tuning for Attention-Based LSTM

An iterative optimization process with 40 trials was conducted to explore configurations across the hyperparameter space $H_1$, aiming to minimize the validation MAE. Each trial comprised 3 independent training runs to account for stochasticity introduced by random weight initializations. All models were trained for up to 150 epochs, with an early-stopping mechanism applied using a patience of 3 epochs and an improvement threshold of $1 \times 10^{-3}$ to prevent overfitting.
In the absence of prior domain knowledge, the search across $H_1$ was initiated with a random configuration, yielding a validation MAE of 0.191 µg/m3 in Trial 1 that outperformed all subsequent trials up to Trial 15. Marginal performance gains were observed in Trial 15, with the validation MAE reduced to 0.188 µg/m3 under a configuration comprising a 3-layer LSTM architecture with [8,16,32] units, a terminal-heavy dropout strategy with rates of 0.0, 0.1, and 0.2, $\eta \approx 7 \times 10^{-4}$, and $\lambda \approx 3 \times 10^{-4}$.
Two additional performance improvements occurred before the termination of the optimization process in $H_1$. In Trial 23, the validation MAE was reduced to 0.187 µg/m3 by switching the dropout strategy from terminal-heavy to no dropout and setting $\eta \approx 8 \times 10^{-4}$. The most pronounced improvement was detected in Trial 35, in which the validation MAE was reduced to 0.184 µg/m3 by refining the model's regularization with a pyramid dropout strategy with rates of 0.0, 0.1, and 0.0, and adjusting $\lambda \approx 9 \times 10^{-4}$ and $\eta \approx 4 \times 10^{-4}$. Throughout the exploration of $H_1$, the validation MAE remained consistently within the narrow band of 0.184–0.194 µg/m3, indicating stable convergence near the global minimum.
Based on the preceding results, the top-performing configurations were used to construct the reduced hyperparameter space $H_2$, including 3- and 4-layer architectures with no-dropout, pyramid, and terminal-heavy dropout strategies. The dropout rates were restricted to the discrete set $r \in \{0.0, 0.1, 0.2\}$, as weaker regularization produced lower validation MAE. Learning rates and L2 regularization strengths were not arbitrarily chosen; instead, they were limited to discrete sets representative of the best-performing orders of magnitude: $\eta \in \{8 \times 10^{-4}, 4 \times 10^{-4}, 1 \times 10^{-4}\}$ and $\lambda \in \{9 \times 10^{-4}, 3 \times 10^{-4}, 1 \times 10^{-4}\}$.
The optimization process in $H_2$ commenced with an initial validation MAE of 0.187 µg/m3, which was marginally improved in Trial 3. In this trial, a 3-layer LSTM architecture with [8,16,32] units per layer decreased the validation MAE by 0.001 µg/m3, utilizing uniform dropout rates of 0.2 across the layers, $\lambda = 3 \times 10^{-4}$, and $\eta = 4 \times 10^{-4}$. In Trial 24, the same architecture employed the pyramid dropout strategy with rates of 0.1, 0.2, and 0.1, $\lambda = 3 \times 10^{-4}$, and $\eta = 8 \times 10^{-4}$, resulting in an improved validation MAE of 0.179 µg/m3, beyond which no further gains were observed.

4.2. Results from Training Process for the Attention-Based LSTM Models

Following the hyperparameter optimization in $H_2$, the three best-performing attention-based LSTM variants were derived for subsequent modeling and evaluation, corresponding to the configurations yielding the lowest MAE (Table 1). These variants were trained with a batch size of 32 for up to 150 epochs, using an early-stopping patience of 3 epochs and a minimum improvement threshold of $1 \times 10^{-3}$ in validation loss.
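A sketch of the training call for one of these variants (here the C1 settings) is given below, assuming the `build_model` helper and data splits from the earlier sketches; the single `validation_split` stands in for the walk-forward scheme.

```python
from tensorflow.keras.callbacks import EarlyStopping

model = build_model(units=(32, 16, 8), dropout=(0.1, 0.2, 0.1), lr=8e-4, l2_reg=3e-4)
early_stop = EarlyStopping(monitor="val_loss", patience=3, min_delta=1e-3,
                           restore_best_weights=True)
history = model.fit(X_train, y_train, validation_split=0.3, epochs=150,
                    batch_size=32, callbacks=[early_stop], verbose=1)
```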
The configuration C1 processed univariate input sequences of 10 consecutive PM2.5 measurements through three stacked LSTM layers with 32, 16, and 8 units, respectively. It employed a pyramid dropout scheme with rates of 0.1, 0.2, and 0.1, and used a learning rate of $8 \times 10^{-4}$ and an L2 regularization strength of $3 \times 10^{-4}$. This configuration exhibited a tri-phasic, smooth, monotonic convergence pattern in both training and validation curves, as illustrated in Figure 6. An initial rapid descent was observed for both curves up to epoch 30, followed by gradual refinement over the subsequent 70 epochs and stable asymptotic behavior until epoch 120. The consistent gap between the curves and the absence of oscillations indicated stable learning without signs of overfitting. The early-stopping mechanism was triggered after 120 epochs, restoring the LSTM model from epoch 117 for subsequent evaluation. The model yielded a training MAE of 0.171 µg/m3 and a validation MAE of 0.177 µg/m3.
The configuration C2 processed univariate input sequences of 10 consecutive PM2.5 measurements through three stacked LSTM layers with 32, 16, and 8 units, respectively. It employed a uniform dropout strategy with a rate of 0.2, using a learning rate of $4 \times 10^{-4}$ and an L2 regularization strength of $3 \times 10^{-4}$. As illustrated in Figure 7, this configuration exhibited learning patterns congruent with those of configuration C1, reaching an earlier plateau and being terminated by the early-stopping mechanism at epoch 84. A wider gap between the training and validation curves was observed during the initial decay phase (25 epochs), followed by gradual refinement of both losses over the next 40 epochs. C2 showed a more resource-efficient learning process while achieving nearly identical predictive performance to the previous configuration. The model yielded a training MAE of 0.174 µg/m3 and a validation MAE of 0.179 µg/m3.
The configuration C3 processed univariate input sequences of 10 consecutive PM2.5 measurements through three stacked LSTM layers with 32, 16, and 8 units, respectively. It employed a pyramid dropout strategy with rates of 0.0, 0.2, and 0.0, and used a learning rate of $4 \times 10^{-4}$ and an L2 regularization strength of $9 \times 10^{-4}$. This configuration revealed a distinct learning pattern compared to the previous ones, as illustrated in Figure 8. In particular, it showed accelerated convergence during the initial 10 epochs, despite using the same learning rate as configuration C2. This steep decay pattern was likely attributable to reduced regularization constraints, reflected in the zero dropout rates of both the initial and terminal layers, partially offset by the higher L2 regularization strength. Unlike the previous configurations, C3 displayed a bi-phasic learning behavior, characterized by a rapid initial descent immediately followed by an early plateau, without gradual intermediate refinement. Erratic oscillations in the validation trajectory between epochs 10 and 20 indicated that performance gains in the training set were not consistently mirrored in the validation set. Training was terminated at epoch 30 by the early-stopping mechanism due to the absence of meaningful validation improvements. The model yielded a training MAE of 0.178 µg/m3 and a validation MAE of 0.193 µg/m3.
Table 2 summarizes the quantitative performance of the three configurations across training and validation. The results highlighted consistent learning dynamics among the models, with configuration C1 achieving the most stable convergence and the lowest MAE, followed by C2 with marginally higher errors and C3 showing reduced learning stability.

4.2.1. Performance Evaluation for the Attention-Based LSTM Models

The validation and test MAE obtained from the three hyperparameter configurations C1, C2, and C3 are presented in Figure 9.
The configuration C1 achieved the optimal predictive performance, corresponding to the lowest validation and test MAE of 0.177 µg/m3 and 0.181 µg/m3, respectively. The consistent MAE across the validation and test sets, with a 2.3% relative difference, indicated robust model generalization and further confirmed stable learning.
The configuration C2 demonstrated a comparable MAE of 0.179 µg/m3 in the validation set, although it resulted in a 4.5% higher MAE in the test set at 0.187 µg/m3. This performance degradation on unseen data suggested that the hyperparameter set of configuration C2 was more susceptible to overfitting and that training for fewer epochs limited the model's capacity to learn more general temporal dependencies.
The configuration C3 resulted in inferior performance compared to the previous ones, with a validation MAE of 0.193 µg/m3 and a test MAE of 0.207 µg/m3, representing a 14.4% performance degradation compared to the optimal configuration C1. The root cause was the truncated, bi-phasic learning and the erratic validation curve, which led to premature overfitting. Despite the higher L2 regularization intended to compensate, the marginal validation improvements after epoch 10 and the early termination at epoch 30 further limited the model's generalization.

4.2.2. Visualization of the Attention-Based LSTM Models Predictive Accuracy

Figure 10 presents a comparative overview of predicted versus actual PM2.5 concentrations for the three attention-based LSTM configurations, covering a representative 50% of the hold-out test set with 195 non-overlapping single-step predictions.
Visual examination provided convincing evidence of the superior predictive performance of configurations C1 and C2 compared to C3 throughout the time-series. Given the minimal residuals, the negligible overshoots and lags, and the close temporal tracking of the ground-truth PM2.5 concentrations under steady-state conditions, both configurations established themselves as robust operational modeling approaches. The 4.5% MAE advantage of configuration C1 over C2 stems from its enhanced forecasting precision during stable pollution conditions, as both configurations achieved immediate magnitude and timing adaptation to the transient spike that occurred in the final temporal segment.
In contrast, configuration C3 demonstrated more pronounced deviations from the actual values and diminished temporal responsiveness. This configuration particularly struggled to capture the temporal onset and magnitude of the extreme contamination event, as well as the episodic, lower-amplitude spikes that occurred in the intermediate intervals. Despite these limitations, configuration C3 preserved the general time-series trend and presented compelling computational advantages. It therefore constitutes a lightweight alternative for applications that tolerate higher prediction uncertainty in exchange for reduced computational resources, such as continuous monitoring systems where approximate trend identification suffices.

5. Conclusions

This study introduced a human-centric approach for assessing short-term PM2.5 exposure, integrating wearable sensing with deep learning to translate predictive analytics into personalized IAQ insights. The proposed approach employed a low-cost, arm-worn wearable device equipped with a Sensirion SPS30 sensor to unobtrusively capture PM2.5 dynamics within the human microenvironment. By functioning at the human level, this device eliminated the need for spatial interpolation from room-level PM2.5 concentrations to individual exposure estimates, thereby reducing the demand for dense sensor deployments and computational resources. An LSTM with an attention mechanism complemented the sensing module, modeling the temporal dependencies that govern human-level PM2.5 dynamics.
To experimentally evaluate the proposed framework, a three-month monitoring campaign was conducted in Trikala, Greece, involving 8 h daily sessions with a municipal office worker and yielding 518,512 PM2.5 samples collected at 5 s intervals. As presented in Section 3.2, the examined office space exhibited relatively stable IAQ dynamics, while occasional contamination spikes served to evaluate model performance under stress conditions. As presented in Section 3.3, the most pronounced dependencies in PM2.5 concentrations were observed at the immediate lags, with significant autocorrelation and shared information persisting for up to 30 min. Accordingly, using a 10 min historical context to predict PM2.5 concentrations 5 min ahead proved to be an effective trade-off between temporal context length and computational overhead.
Following Bayesian hyperparameter optimization, the three best-performing attention-based LSTM variants were evaluated, each adopting a distinct architecture, regularization strategy, and learning rate. A 3-layer LSTM model comprising [8,16,32] units per layer and a pyramid dropout with rates of 0.0, 0.1, and 0.0, $\lambda = 3 \times 10^{-4}$, and $\eta = 8 \times 10^{-4}$ achieved the optimal performance, reporting an MAE of 0.183 µg/m3. Comparable performance, with an MAE only 0.004 µg/m3 higher, was obtained from a 3-layer LSTM model employing uniform dropout rates of 0.2, $\lambda = 3 \times 10^{-4}$, and $\eta = 4 \times 10^{-4}$.
Beyond predictive accuracy, practical considerations related to device operation and user experience were also assessed. The wearable device was powered by a 1600 mAh lithium-polymer battery featuring dual charging options—USB Type-C and an integrated photovoltaic panel—which enabled continuous monitoring for three months with only five recharging cycles. The SPS30 sensor maintained stable performance throughout this period, supported by its built-in contamination-resistance mechanisms and the applied EPA-based humidity calibration that ensured measurement reliability under varying environmental conditions. The participant reported satisfactory comfort and unobtrusiveness during daily use; however, future iterations should emphasize further miniaturization and ergonomic improvements to enhance long-term user acceptability in real-world deployments.
Multiple research avenues warrant future investigation to advance and further elaborate this promising forecasting approach. First, the predictive performance of attention-based LSTM models over more extended horizons should be prioritized, also incorporating multivariate input sequences that comprise additional predictors of PM2.5 concentrations, including climatic conditions, gaseous pollutants, and occupancy status. Second, more sophisticated DL architectures should also be explored, such as Transformer networks and ensemble learning approaches, owing to their strong capacity to model intricate temporal patterns. Third, to establish the proposed model's robustness and generalization, it should be replicated and validated across indoor settings that experience diverse pollution concentration patterns and environmental conditions. Finally, the practical implementation of the proposed attention-based LSTM model should be pursued in real-world environments, particularly through its integration with epidemiological and other health-outcome models to enable personalized exposure-health risk assessment.

Author Contributions

Conceptualization, C.M. and J.G.; Methodology, C.M.; Software, C.M. and G.P.; Validation, C.M., G.P. and J.G.; Investigation, C.M. and G.P.; Data curation, C.M. and G.P.; Supervision, J.G.; Project administration, J.G. All authors have read and agreed to the published version of the manuscript.

Funding

This work has been supported by the TwinAIR project No. 101057779, “Indoor air quality and health”, HORIZON-HLTH-2021-ENVHLTH-02-02.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author due to privacy restrictions.

Use of Artificial Intelligence

AI or AI-assisted tools were not used in drafting any aspect of this manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. World Health Organization (WHO). Air Quality, Energy and Health. Available online: https://www.who.int/teams/environment-climate-change-and-health/air-quality-energy-and-health/sectoral-interventions/household-air-pollution/health-risks (accessed on 25 July 2025).
  2. European Commission. Air Quality. Available online: https://environment.ec.europa.eu/topics/air/air-quality_en (accessed on 25 July 2025).
  3. United Nations. World Environment Situation Room. United Nations Environment Programme. Available online: https://unece.org/environmental-policy-1/air (accessed on 25 July 2025).
  4. Vardoulakis, S.; Giagloglou, E.; Steinle, S.; Davis, A.; Sleeuwenhoek, A.; Galea, K.S.; Dixon, K.; Crawford, J.O. Indoor Exposure to Selected Air Pollutants in the Home Environment: A Systematic Review. Int. J. Environ. Res. Public Health 2020, 17, 8972. [Google Scholar] [CrossRef] [PubMed]
  5. Xing, Y.F.; Xu, Y.H.; Shi, M.H.; Lian, Y.X. The Impact of PM2.5 on the Human Respiratory System. J. Thorac. Dis. 2016, 8, E69–E74. [Google Scholar] [PubMed]
  6. Miller, L.; Xu, X. Ambient PM2.5 Human Health Effects—Findings in China and Research Directions. Atmosphere 2018, 9, 424. [Google Scholar] [CrossRef]
  7. Lu, F.; Xu, D.; Cheng, Y.; Dong, S.; Guo, C.; Jiang, X.; Zheng, X. Systematic Review and Meta-Analysis of the Adverse Health Effects of Ambient PM2.5 and PM10 Pollution in the Chinese Population. Environ. Res. 2015, 136, 196–204. [Google Scholar] [CrossRef]
  8. Voultsidis, D.; Gialelis, J.; Protopsaltis, G.; Bali, N.; Mountzouris, C. Utilizing Unobtrusive Portable Electronic Devices for Real-Time Assessment of Indoor PM2.5 and tVOC Exposure and Its Correlation with Heart Rate Variability. Procedia Comput. Sci. 2023, 224, 550–557. [Google Scholar] [CrossRef]
  9. Settimo, G.; Yu, Y.; Gola, M.; Buffoli, M.; Capolongo, S. Challenges in IAQ for Indoor Spaces: A Comparison of the Reference Guideline Values of Indoor Air Pollutants from Governments and International Institutions. Atmosphere 2023, 14, 633. [Google Scholar] [CrossRef]
  10. World Health Organization (WHO). Household Air Pollution and Health. Available online: https://www.who.int/news-room/fact-sheets/detail/household-air-pollution-and-health (accessed on 25 July 2025).
  11. Hernández-Gordillo, A.; Ruiz-Correa, S.; Robledo-Valero, V.; Hernández-Rosales, C.; Arriaga, S. Recent Advancements in Low-Cost Portable Sensors for Urban and Indoor Air Quality Monitoring. Air Qual. Atmos. Health 2021, 14, 1931–1951. [Google Scholar] [CrossRef]
  12. Zaidan, M.A.; Motlagh, N.H.; Boor, B.E.; Lu, D.; Nurmi, P.; Petäjä, T.; Ding, A.; Kulmala, M.; Tarkoma, S.; Hussein, T. Virtual Sensors: Toward High-Resolution Air Pollution Monitoring Using AI and IoT. IEEE Internet Things Mag. 2023, 6, 76–81. [Google Scholar] [CrossRef]
  13. Adhikari, A.; Hussain, C.M. From Detection to Solution: A Review of Machine Learning in PM2.5 Sensing and Sustainable Green Mitigation Approaches (2021–2025). Processes 2025, 13, 2207. [Google Scholar] [CrossRef]
  14. Si, M.; Xiong, Y.; Du, S.; Du, K. Evaluation and Calibration of a Low-Cost Particle Sensor in Ambient Conditions Using Machine-Learning Methods. Atmos. Meas. Tech. 2020, 13, 1693–1707. [Google Scholar] [CrossRef]
  15. Zimmerman, N.; Presto, A.A.; Kumar, S.P.N.; Gu, J.; Hauryliuk, A.; Robinson, E.S.; Robinson, A.L.; Subramanian, R. A Machine Learning Calibration Model Using Random Forests to Improve Sensor Performance for Lower-Cost Air Quality Monitoring. Atmos. Meas. Tech. 2018, 11, 291–313. [Google Scholar] [CrossRef]
  16. Wijeratne, L.O.H.; Kiv, D.R.; Aker, A.R.; Talebi, S.; Lary, D.J. Using Machine Learning for the Calibration of Airborne Particulate Sensors. Sensors 2020, 20, 99. [Google Scholar] [CrossRef] [PubMed]
  17. Karatzas, S.; Merino, J.; Puchkova, A.; Mountzouris, C.; Protopsaltis, G.; Gialelis, J.; Parlikad, A.K. A Virtual Sensing Approach to Enhancing Personalized Strategies for Indoor Environmental Quality and Residential Energy Management. Build. Environ. 2024, 261, 111684. [Google Scholar] [CrossRef]
  18. Fan, J.; Chu, H.; Liu, L.; Ma, H. LLMAir: Adaptive Reprogramming Large Language Model for Air Quality Prediction. In Proceedings of the 2024 IEEE 30th International Conference on Parallel and Distributed Systems (ICPADS), Belgrade, Serbia, 17–19 December 2024; pp. 423–430. [Google Scholar]
  19. Patel, Z.B.; Bachwana, Y.; Sharma, N.; Guttikunda, S.; Batra, N. VayuBuddy: An LLM-Powered Chatbot to Democratize Air Quality Insights. arXiv 2024, arXiv:2411.12760. Available online: https://arxiv.org/abs/2411.12760 (accessed on 10 September 2025).
  20. Huang, C.-J.; Kuo, P.-H. A Deep CNN-LSTM Model for Particulate Matter (PM2.5) Forecasting in Smart Cities. Sensors 2018, 18, 2220. [Google Scholar] [CrossRef]
  21. Zaini, N.; Ean, L.W.; Ahmed, A.N.; Malek, M.A.; Chow, M.F. PM2.5 Forecasting for an Urban Area Based on Deep Learning and Decomposition Method. Sci. Rep. 2022, 12, 17565. [Google Scholar] [CrossRef]
  22. Bai, Y.; Zeng, B.; Li, C.; Zhang, J. An Ensemble Long Short-Term Memory Neural Network for Hourly PM2.5 Concentration Forecasting. Chemosphere 2019, 222, 286–294. [Google Scholar] [CrossRef]
  23. Chae, S.; Shin, J.; Kwon, S.; Lee, S.; Kang, S.; Lee, D. PM10 and PM2.5 Real-Time Prediction Models Using an Interpolated Convolutional Neural Network. Sci. Rep. 2021, 11, 11952. [Google Scholar] [CrossRef]
  24. Prihatno, A.T.; Nurcahyanto, H.; Ahmed, M.F.; Rahman, M.H.; Alam, M.M.; Jang, Y.M. Forecasting PM2.5 Concentration Using a Single-Dense Layer BiLSTM Method. Electronics 2021, 10, 1808. [Google Scholar] [CrossRef]
  25. Dai, X.; Liu, J.; Li, Y. A Recurrent Neural Network Using Historical Data to Predict Time Series Indoor PM2.5 Concentrations for Residential Buildings. Indoor Air 2021, 31, 1228–1237. [Google Scholar] [CrossRef]
  26. Karaiskos, P.; Munian, Y.; Martinez-Molina, A.; Alamaniotis, M. Indoor Air Quality Prediction Modeling for a Naturally Ventilated Fitness Building Using RNN-LSTM Artificial Neural Networks. Smart Sustain. Built Environ. 2024. ahead-of-print. [Google Scholar] [CrossRef]
  27. Lagesse, B.; Wang, S.; Larson, T.V.; Kim, A.A. Performing Indoor PM2.5 Prediction with Low-Cost Data and Machine Learning. Facilities 2022, 40, 495–514. [Google Scholar] [CrossRef]
  28. Singh, A.; Islam, M.; Dinh, N. Forecasting Indoor Air Quality Using Machine Learning Models. In Proceedings of the 2024 IEEE International Conference on Consumer Electronics (ICCE), Las Vegas, NV, USA, 6–8 January 2024; pp. 1–6. [Google Scholar]
  29. Dai, H.; Liu, Y.; Wang, J.; Ren, J.; Gao, Y.; Dong, Z.; Zhao, B. Large-Scale Spatiotemporal Deep Learning Predicting Urban Residential Indoor PM2.5 Concentration. Environ. Int. 2023, 182, 108343. [Google Scholar] [CrossRef]
  30. Utama, I.B.K.Y.; Tran, D.H.; Pamungkas, R.F.; Chung, B.; Jang, Y.M. Predicting Indoor PM2.5 Concentration Using LSTM-BNN in Edge Device. In Proceedings of the 2023 International Conference on Artificial Intelligence in Information and Communication (ICAIIC), Bali, Indonesia, 20–23 February 2023; pp. 211–215. [Google Scholar]
  31. Zhang, Z.; Zhang, S. Modeling Air Quality PM2.5 Forecasting Using Deep Sparse Attention-Based Transformer Networks. Int. J. Environ. Sci. Technol. 2023, 20, 13535–13550. [Google Scholar] [CrossRef]
  32. Mahajan, S.; Chen, L.-J.; Tsai, T.-C. Short-Term PM2.5 Forecasting Using Exponential Smoothing Method: A Comparative Analysis. Sensors 2018, 18, 3223. [Google Scholar] [CrossRef]
  33. Sensirion, A.G. Particulate Matter Sensors: Application Notes & Specification Statement. Available online: https://sensirion.com/media/documents/B7AAA101/61653FB8/Sensirion_Particulate_Matter_AppNotes_Specification_Statement.pdf (accessed on 10 September 2025).
  34. Sensirion, A.G. Datasheet: SPS30 Particulate Matter (PM) Sensor. Available online: https://sensirion.com/media/documents/8600FF88/64A3B8D6/Sensirion_PM_Sensors_Datasheet_SPS30.pdf (accessed on 10 September 2025).
  35. Jayaratne, R.; Liu, X.; Thai, P.; Dunbabin, M.; Morawska, L. The Influence of Humidity on the Performance of a Low-Cost Air Particle Mass Sensor and the Effect of Atmospheric Fog. Atmos. Meas. Tech. 2018, 11, 4883–4890. [Google Scholar] [CrossRef]
  36. U.S. Environmental Protection Agency; U.S. Forest Service. AirNow Fire and Smoke Map: Questions and Answers. Available online: https://www.airnow.gov/fasm-info (accessed on 10 September 2025).
Figure 1. The wearable device positioned on the upper right arm, as worn by a municipal office employee during regular work activity.
Figure 2. Boxplot illustrating the distribution of PM2.5 concentrations during the monitoring period.
Figure 3. Autocorrelation (ACF) and partial autocorrelation (PACF) plots of PM2.5 concentrations across 30 lags.
Figure 4. Mutual Information (MI) scores between current and lagged PM2.5 concentrations across 30 time lags.
Figure 5. The proposed attention-based LSTM architecture for 5 min ahead PM2.5 forecasting.
Figure 6. The training and validation curves for configuration C1.
Figure 7. The training and validation curves for configuration C2.
Figure 8. The training and validation curves for configuration C3.
Figure 9. Validation and test MAE per configuration.
Figure 10. Predicted versus actual PM2.5 concentrations for the three configurations.
Table 1. The three best-performing configurations in H 2 , ranked by their validation MAE.
Configuration | Layers (Dropout Rate) | η | λ
C1 | 32 (0.1), 16 (0.2), 8 (0.1) | 8 × 10⁻⁴ | 3 × 10⁻⁴
C2 | 32 (0.2), 16 (0.2), 8 (0.2) | 4 × 10⁻⁴ | 3 × 10⁻⁴
C3 | 32 (0.0), 16 (0.1), 8 (0.0) | 4 × 10⁻⁴ | 9 × 10⁻⁴
Table 2. Training and validation MAE for the three LSTM configurations.
Configuration | Training MAE (µg/m3) | Validation MAE (µg/m3)
C1 | 0.171 | 0.177
C2 | 0.174 | 0.179
C3 | 0.178 | 0.193
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
