Spatiotemporal Graph Neural Networks for PM2.5 Concentration Forecasting

Chabalala, Vongani; Rudolph, Craig; Mosala, Karabo; Nkadimeng, Edward Khomotso; Mosomane, Chuene; Mathaha, Thuso; Basu, Pallab; Mahboob, Muhammad Ahsan; Kong, Jude; Bragazzi, Nicola; Atif, Iqra; Kumar, Mukesh; Mellado, Bruce

doi:10.3390/air4010002

by Vongani Chabalala ^1,2,*, Craig Rudolph ^1,2, Karabo Mosala ^1, Edward Khomotso Nkadimeng ^2, Chuene Mosomane ^1,2, Thuso Mathaha ^1,2, Pallab Basu ^1, Muhammad Ahsan Mahboob ^3, Jude Kong ^4, Nicola Bragazzi ^5, Iqra Atif ^1, Mukesh Kumar ^1,* and Bruce Mellado ^1,2,*

^1 University of the Witwatersrand, 1 Jan Smuts Avenue, Johannesburg 2050, South Africa

^2 iThemba LABS, National Research Foundation, P.O. Box 722, Somerset West 7129, South Africa

^3 Sibanye-Stillwater Digital Mining Laboratory (DigiMine), Wits Mining Institute (WMI), University of the Witwatersrand, Johannesburg 2050, South Africa

^4 Artificial Intelligence & Mathematical Modeling Lab (AIMM Lab), Dalla Lana School of Public Health, University of Toronto, 155 College St Room 500, Toronto, ON M5T 3M7, Canada

^5 Laboratory for Industrial and Applied Mathematics (LIAM), Department of Mathematics and Statistics, York University, Toronto, ON M3J 1P3, Canada

^* Authors to whom correspondence should be addressed.

Air 2026, 4(1), 2; https://doi.org/10.3390/air4010002

Submission received: 26 September 2025 / Revised: 2 December 2025 / Accepted: 23 December 2025 / Published: 13 January 2026

(This article belongs to the Special Issue Air Pollution Exposure and Its Impact on Human Health)

Abstract

Air pollution, particularly fine particulate matter (PM_2.5), poses significant public health and environmental risks. This study explores the effectiveness of spatiotemporal graph neural networks (ST-GNNs) in forecasting PM_2.5 concentrations by integrating remote-sensing hyperspectral indices with traditional meteorological and pollutant data. The model was evaluated using data from Switzerland and the Gauteng province in South Africa, with datasets spanning from January 2016 to December 2021. Key performance metrics, including root mean squared error (RMSE), mean absolute error (MAE), probability of detection (POD), critical success index (CSI), and false alarm rate (FAR), were employed to assess model accuracy. For Switzerland, the integration of spectral indices improved RMSE from 1.4660 to 1.4591, MAE from 1.1147 to 1.1053, CSI from 0.8345 to 0.8387, POD from 0.8961 to 0.8972, and reduced FAR from 0.0760 to 0.0719. In Gauteng, RMSE decreased from 6.3486 to 6.2319, MAE from 4.4891 to 4.4066, CSI from 0.9555 to 0.9560, and POD from 0.9699 to 0.9732, while FAR slightly increased from 0.0154 to 0.0181. Error analysis revealed that while the initial one-day ahead forecast without spectral indices had a marginally lower error, the dataset with spectral indices outperformed from the two-day ahead mark onwards. The error for Swiss monitoring stations stabilized over longer prediction lengths, indicating the robustness of the spectral indices for extended forecasts. The study faced limitations, including the exclusion of the Planetary Boundary Layer (PBL) height and K-index, lack of terrain data for South Africa, and significant missing data in remote sensing indices. Despite these challenges, the results demonstrate that ST-GNNs, enhanced with hyperspectral data, provide a more accurate and reliable tool for PM_2.5 forecasting. 
Future work will focus on expanding the dataset to include additional regions and further refining the model by incorporating additional environmental variables. This approach holds promise for improving air quality management and mitigating health risks associated with air pollution.

Keywords: spatiotemporal graph neural networks; PM_2.5; air pollution

1. Introduction

Air pollution has become a serious challenge in low- and middle-income countries (LMICs) such as South Africa, which is highly dependent on coal-fired power plants for energy production. However, air pollution is not limited to LMICs [1,2]. High-income countries such as Switzerland also experience adverse health effects from relatively low PM_2.5 concentrations, particularly among vulnerable populations such as children, the elderly, and those with preexisting conditions. Studying air pollution is therefore crucial to understanding its impact on people’s health and livelihoods [3].

Breathing high levels of pollutants can lead to cancers and various other diseases. Particulate matter smaller than 2.5 microns (PM_2.5) is one of the main carcinogens in air pollution. These particles are so small that they can penetrate deep into lung tissue and enter the bloodstream, causing dangerous effects on the human body. This pollutant alone causes 7 million deaths annually worldwide [4]. PM_2.5 originates from a diverse range of sources, including fossil fuel combustion, mechanical abrasion processes (such as tire, brake, and road wear), agricultural activities, construction and industrial emissions, as well as natural sources like dust and wildfires. More broadly, PM_2.5 can be emitted directly from primary sources, commonly associated with combustion activities, and can also form as secondary particles through chemical reactions in the atmosphere involving precursor gases such as sulfur dioxide, nitrogen oxides, and organic compounds. These organic compounds may be natural or anthropogenic, including those emitted from automobile exhausts [4,5].

In addition to its profound effects on human health and air quality, PM_2.5 also poses a significant and pervasive threat to the environment [6]. When present in the atmosphere, PM_2.5 absorbs and scatters sunlight, which contributes to atmospheric haze and reduced visibility, alters temperature patterns, and plays a role in climate change. Furthermore, PM_2.5 influences the Earth’s radiation balance, affects atmospheric dynamics, and plays a role in cloud formation, thereby influencing precipitation patterns and overall weather conditions.

The multifaceted environmental consequences of PM_2.5 underscore the urgent need for comprehensive strategies to mitigate its emissions and address its detrimental impact on human health and the broader ecosystem [4,5,6]. Forecasting PM_2.5 concentrations is crucial for implementing measures that reduce emissions and lower PM_2.5 levels in the atmosphere. It is therefore necessary to use current information to forecast pollutant concentrations and to identify the changes needed to reduce PM_2.5 levels in the atmosphere [7].

The study of pollutant concentration is inherently a spatiotemporal problem. The spatial correlation arises because pollutant levels in surrounding areas affect the concentration in the studied area. The temporal correlation arises because current and future pollutant concentrations depend on previous concentration measurements [8]. This is where spatiotemporal graph neural networks (ST-GNNs) show their utility. A spatiotemporal GNN consists of two stages. The first is a graph neural network, which applies neural networks to graph-structured data. A graph is a mathematical structure consisting of nodes and edges, where the graph and its constituents can have attributes or features that can be updated. Information on the graph is exchanged in a message-passing step, in which nodes send information along the edges, aggregate what they receive, and update their own features. This is where spatial correlation comes into play: a network of air monitoring stations can be represented as a graph, with the nodes being the air monitoring stations and the edges representing the interactions between the stations. The second stage uses a recurrent neural network (RNN), which captures the temporal correlation and helps model temporal diffusion. The two-stage model used in this study is the PM_2.5-GNN, a domain-knowledge enhanced spatiotemporal GNN.
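To make the message-passing step concrete, the following is a minimal sketch in plain Python. It is not the PM_2.5-GNN update rule: the mean aggregation, the `self_weight` mixing factor, the station names, and the feature values are all illustrative assumptions.

```python
from collections import defaultdict


def message_passing_step(node_features, edges, self_weight=0.5):
    """One round of mean-aggregation message passing on a directed graph.

    node_features: dict mapping node id -> list of floats
    edges: list of (source, target) pairs; messages flow source -> target
    self_weight: hypothetical factor mixing a node's own features with the
                 mean of its incoming messages (an assumption of this sketch)
    """
    # Collect the features each node receives from its in-neighbours.
    incoming = defaultdict(list)
    for src, dst in edges:
        incoming[dst].append(node_features[src])

    updated = {}
    for node, feats in node_features.items():
        messages = incoming.get(node)
        if not messages:
            # No incoming edges: the node keeps its current features.
            updated[node] = list(feats)
            continue
        # Aggregate: element-wise mean over all received feature vectors.
        mean_msg = [sum(vals) / len(messages) for vals in zip(*messages)]
        # Update: blend the node's own features with the aggregated message.
        updated[node] = [
            self_weight * own + (1.0 - self_weight) * msg
            for own, msg in zip(feats, mean_msg)
        ]
    return updated


# Three hypothetical stations with a single feature each (e.g. PM_2.5).
stations = {"A": [10.0], "B": [20.0], "C": [40.0]}
links = [("A", "B"), ("C", "B"), ("B", "A")]
new_features = message_passing_step(stations, links)
```

In this toy graph, station B moves toward the average of A and C, while C, which has no incoming edges, is unchanged. A learned GNN would replace the fixed blend with trainable functions.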

2. Literature Review

Air pollution is a significant contributor to serious environmental and health concerns, arising from industrial emissions, atmospheric contamination due to climate and traffic factors, and fossil fuel combustion. Recognizing this as a global issue, many countries have established air pollution control stations in various cities to monitor pollutants such as nitrogen dioxide (NO₂), carbon dioxide (CO₂), sulfur dioxide (SO₂), and particulate matter (PM_2.5, PM₁₀). These stations alert residents when pollution levels exceed the threshold. Given the increasing air pollution levels, it is crucial to develop machine learning models that capture data, model it, and predict air pollutant concentrations. Africa, in particular, faces a shortage of reliable air quality sensors for monitoring and predicting PM_2.5 compared to other regions, highlighting the potential for expanding research in air pollution control [9].

As previously mentioned, predicting the concentration of PM_2.5 presents an inherent spatiotemporal challenge. These concentrations are influenced not only by pollutant levels in neighboring areas but also by their past values. This complexity aligns with the nature of a multivariate time series regression problem, as PM_2.5 concentration is impacted by various factors, including weather conditions and contributions from primary and secondary pollutant sources. While these elements are commonly employed as features, there is growing interest in utilizing spectral indices derived from remote-sensing data obtained through satellite imagery. These indices are combinations of spectral bands designed to enhance the sensitivity of satellite observations to specific environmental variables, such as vegetation health, water content, and air quality. Numerous studies have examined the correlation between spectral indices and PM_2.5 concentrations. The underlying concept is based on the interaction of aerosol particles, including PM_2.5, with sunlight, which induces spectral modifications in the atmosphere. These alterations are detectable via remote sensing instruments, leading to the identification of empirical relationships between spectral indices and ground-based PM_2.5 measurements. Specifically, spectral indices with spectral bands positioned in the visible and near-infrared segments of the electromagnetic spectrum show a strong correlation with PM_2.5 concentration.

Dobrae et al. (2020) proposed a method for assessing atmospheric levels, focusing on PM_2.5 and PM₁₀, using Support Vector Regression (SVR), Autoregressive Integrated Moving Average (ARIMA), and Long Short-Term Memory (LSTM) models. Their comparative analysis revealed that SVR and ARIMA were the most effective methods for predicting air pollutant concentrations [10]. In a separate study, Tshepang et al. (2023) employed various machine learning models, including Support Vector Machine, Decision Tree, Logistic Regression, K-Nearest Neighbors, CatBoost Regressor, Extreme Gradient Boosting Regressor, and Random Forest Classifier, to evaluate and predict PM_2.5 behavior. The CatBoost Regressor stood out as the most effective for PM_2.5 predictions, while both Random Forest Classifier and Decision Tree were identified as equally successful for determining Air Quality Index (AQI) status [11].

Sangwon et al. (2021) introduced a real-time prediction model involving data interpolation and a Convolutional Neural Network (CNN) for predicting PM₁₀ and PM_2.5 concentrations. They demonstrated high performance through the use of spatiotemporal information and suggested a novel approach in prediction methodology [12]. Chen et al. (2023) proposed a novel approach to predict PM_2.5 concentrations by combining a Convolutional Neural Network (CNN) with a Random Forest (RF) model [13]. In their study, CNN was employed to extract essential meteorological and pollution data gathered from 13 monitoring stations in Kaohsiung, while the RF algorithm was used for PM_2.5 prediction. Evaluation based on root mean square error (RMSE) and mean absolute error (MAE) demonstrated that the proposed CNN–RF hybrid model outperformed the individual CNN or RF models in terms of modeling capability [13].

Vignesh et al. (2023) collected daily PM_2.5 observational data (from January 2015 to December 2021) from the OpenAQ air quality database and implemented various machine learning algorithms, such as Linear Regression (LR), Decision Tree (DT), Gradient Boosting Regression (GBR), AdaBoost Regression (ABR), XGBoost (XGB), K-Nearest Neighbors (K-NN), Long Short-Term Memory (LSTM), Random Forest (RF), and Support Vector Machine (SVM), to predict PM_2.5 concentrations [14]. Zaini et al. (2022) proposed a hybrid deep learning model (EEMD-LSTM) to decompose the original sequence station data of particulate matter into several subseries and predict hour-ahead PM_2.5 concentrations. The performance of the hybrid model was impressive, achieving an R² of more than 90 percent [15]. Liu et al. (2021) [16] introduced a machine learning approach (Random Forest) to estimate ambient PM_2.5 concentrations, leveraging various meteorological and satellite-derived parameters as predictors. The study demonstrated a rigorous methodology in model development and evaluation, including feature selection techniques and cross-validation methods to ensure robustness and generalizability.

The findings from Liu et al. reveal promising results, with the proposed model achieving high accuracy in predicting PM_2.5 concentrations, offering valuable insights for air quality management strategies in the Gauteng Province, South Africa [16]. Singh et al. (2024) proposed an integrated spatiotemporal graph neural network (ISTGNN) that effectively captures both spatial dependencies in road networks and temporal patterns in traffic data, demonstrating that integrating GNNs with temporal models significantly enhances forecasting accuracy [17]. Yu et al. (2023) combined spatiotemporal learning with Generative Adversarial Networks (GANs) to improve atmospheric nowcasting, showing that GAN-based frameworks can generate sharper and more realistic predictions in meteorological applications [18]. Similarly, Bentsen et al. (2023) introduced a unified graph-based formulation for wind forecasting, illustrating that representing meteorological fields as graphs enables more flexible modeling of spatial interactions [19]. Han et al. (2023) incorporated Ollivier–Ricci curvature into spatiotemporal GNNs to enhance traffic flow forecasting, highlighting the value of geometric graph features in strengthening model robustness [20]. Lastly, Zhu et al. (2024) developed STDNet, a spatiotemporal decomposition network for Arctic sea ice concentration prediction, demonstrating that decomposition-based frameworks effectively capture multi-scale temporal variability in environmental data [21]. Collectively, these studies reinforce the growing trend toward using GNNs and hybrid spatiotemporal architectures for complex environmental prediction tasks, supporting their applicability to air quality forecasting, where strong spatial–temporal dependencies are also present.

Yu, Yin, and Zhu (2018) introduced one of the earliest unified spatiotemporal GNN frameworks, demonstrating that coupling graph convolutions with temporal convolutions can accurately model traffic dynamics across complex road networks [22]. Roy et al. (2021) advanced this idea by proposing USTGCN, a model that jointly aggregates spatial and temporal dependencies while incorporating recurring daily traffic patterns [23]. Ju et al. (2024) further expanded the modeling capacity of STGNNs through COOL, which uses heterogeneous graph construction and a self-attention decoder to capture diverse and evolving traffic relationships [24]. Ahn et al. (2024) improved short-term traffic speed prediction by integrating STGCN with CNN modules, enabling better extraction of local spatial features [25]. The study published in Complex and Intelligent Systems (2023) strengthened interpretability by fusing external knowledge, such as weather and points of interest, into a spatiotemporal graph neural network [26]. Ahn et al. (2023) introduced AASTGNet, which employs adaptive graph learning and attention mechanisms to dynamically reflect changing road conditions [27]. Liu, Shojaee, and Reddy (2023) proposed a hybrid graph neural ODE framework capable of modeling both short-term local traffic fluctuations and longer-range temporal dynamics [28].

Choi and Kim (2022) presented Rad-cGAN, a conditional GAN-based model that enhances precipitation nowcasting by generating realistic radar-based spatiotemporal forecasts [29]. Zhang, Song, Han, and Zhang introduced a GAN-powered remote sensing fusion method that improves the spatiotemporal consistency of satellite images, providing higher-quality inputs for earth observation and environmental monitoring tasks [30]. One of the challenges is that PM_2.5 levels often exhibit strong spatial and temporal dependencies. The pollution level at one location can be influenced by nearby locations, making it difficult for machine learning algorithms such as LSTM and CNN to yield optimal results when predicting PM_2.5. For this predictive challenge, Graph Neural Networks (GNNs) emerge as a prime candidate. Unlike Convolutional Neural Networks and Recurrent Neural Networks, GNNs are suited to non-Euclidean data representations, such as networks of air monitoring stations. In this network, nodes represent the monitoring stations, while edges represent interactions between stations, collectively forming a graph structure. Both nodes and edges within the graph can possess distinct features, including meteorological attributes, measurements of other pollutant concentrations, and remote sensing spectral indices.

3. Data Acquisition and Methodological Framework

In this section, we discuss the two distinct study areas and describe the datasets used in the forecasting model to address the PM_2.5 concentration prediction problem. Since we are working with two different study areas and the datasets for each are not identical, we will describe the datasets for each area in the following subsections. Firstly, we describe the air monitoring data, which is used for training and validating the model. The air monitoring dataset includes measurements of various air pollutants, including PM_2.5, collected from multiple monitoring stations in each study area. This dataset serves as the primary input for our model and is essential for understanding the spatial and temporal variations in PM_2.5 concentrations.

The second dataset is the meteorological dataset, which supplements the air monitoring data. This dataset includes various meteorological parameters, such as temperature, humidity, wind speed, and wind direction, all of which can influence PM_2.5 levels. By incorporating meteorological data, we aim to improve the accuracy of our PM_2.5 concentration predictions by accounting for the effects of weather conditions on pollutant dispersion and transformation. Each subsection will provide detailed information on the data collection methods, sources, and preprocessing steps for both the air monitoring and meteorological datasets for the two study areas. This comprehensive approach ensures that the forecasting model is well informed by a diverse set of relevant factors, thereby enhancing its predictive capabilities for PM_2.5 concentrations.

3.1. Study Area

Switzerland, a landlocked country covering an area of 41,285 km², is one of the study areas in this research. It is bordered by Germany, France, Italy, Liechtenstein, and Austria, with a maximum length of 220 km along the north–south axis and a width of 350 km extending from east to west. Switzerland is divided into three main geographic regions: the Swiss Alps, which occupy the southern and eastern parts of the country; the Jura mountains in the northwest; and the central plateau. Switzerland has relatively low PM_2.5 concentrations compared to many other countries because of its strict environmental regulations and policies. The Swiss National Air Pollution Monitoring Network evaluates air quality at 16 sites across Switzerland, as shown in Figure 1. The different colors indicate the types of environments in which the sensor sites are located. These sites are strategically positioned to capture representative pollution levels across diverse settings, including urban roadsides, residential areas, and rural locations. This network provides a comprehensive overview of air quality across various geographic and demographic regions, contributing valuable data for our study on PM_2.5 concentration prediction.

South Africa, covering an area of 1,221,037 km², is the second study area in this research. It is bordered by Namibia, Botswana, and Zimbabwe to the north, and by Mozambique and Eswatini (formerly Swaziland) to the northeast and east. The country consists of nine provinces, but our study focuses on one: Gauteng. Gauteng is a heavily industrialized province with dense traffic, which results in high levels of air pollution, including PM_2.5. Air quality also varies with the seasons because temperature inversions trap pollution close to the ground. Studying PM_2.5 levels in Gauteng can help in understanding pollution patterns in other cities with similar problems. The South African Air Quality Information System (SAAQIS) provides near real-time air quality data from over 70 monitoring stations. Figure 2 shows the locations of these monitoring stations across South Africa; the marker colors indicate the types of environments in which the sensor sites are located. Unlike the Swiss data, this source does not provide specific site names. Due to data limitations, our analysis concentrates on only eight stations located within Gauteng province. These stations provide the most comprehensive and reliable data available for our study and form the basis for predicting PM_2.5 concentrations in this region.

Switzerland, a high-income country with strict environmental policies and relatively low pollution levels, contrasts strongly with Gauteng, a heavily industrialized and densely populated region facing persistent air quality challenges. Examining these regions provides a valuable opportunity to understand how the varying urban settings and regulatory frameworks impact air pollution patterns, offering insights that could support the development of more flexible and internationally applicable air quality management approaches.

3.2. Air Quality Datasets

3.2.1. Switzerland Monitoring Stations

The datasets used in this study are sourced from the National Air Pollution Monitoring Network (NABEL), which collects hourly air pollution data from 16 monitoring stations. However, only 8 of these stations have gathered sufficient PM_2.5 concentration data (see Table 1) for the entire study period, which spans from 1 January 2016 to 31 December 2021. As a result, only these 8 monitoring stations were included in this study. The monitoring stations also record temperature, precipitation, and global radiation. This comprehensive dataset provides a solid foundation for analyzing PM_2.5 concentrations and their relationship with various environmental factors.

3.2.2. South Africa: Gauteng Province Monitoring Stations

The Gauteng datasets are sourced from Open-Meteo and the South African Air Quality Information System (SAAQIS), which provide hourly data for the available monitoring stations. However, only 8 of these stations have gathered sufficient PM_2.5 concentration data (see Table 2) for the entire study period, which spans from 1 January 2016 to 16 November 2022. This dataset provides the necessary information for analyzing PM_2.5 concentrations and understanding their temporal and spatial variations within the specified timeframe.

3.2.3. Meteorological and Pollutant Concentration Data

Meteorological data, such as wind speed and wind direction, are essential for the model. These data were obtained from the open-source historical weather API Open-Meteo and from SAAQIS, both of which provide data at an hourly sampling rate. Table 3 lists the features collected from these sources. It is important to note that the list of features presented here differs from those used in the PM_2.5-GNN paper by Wang et al. (2020) [31]. Specifically, the Planetary Boundary Layer (PBL) height and K-index were not included in this study. This represents a limitation, as research has shown that the PBL height is related to the vertical dilution of pollutants and exhibits an inverse linear relationship with pollutant concentrations.

3.2.4. Remote Sensing Hyperspectral Imaging Data

In addition to meteorological and pollutant concentration data, we included remote-sensing spectral index data. These data were derived from MODIS satellite observations and sampled at a daily rate. However, because the satellite collects observations only every 1 to 2 days, the series contains periodic gaps, which were addressed during preprocessing to ensure temporal consistency in the dataset. The daily sampling frequency of the spectral data also determined the resolution of the NABEL, Open-Meteo, and SAAQIS datasets, which were resampled to a daily rate to match it. All features were then standardized using Z-score normalization, which transforms each feature to have a mean of 0 and a standard deviation of 1 by subtracting the feature mean from its values and dividing by the feature standard deviation. This was crucial because features in a dataset often have different scales, and standardization ensures that each feature contributes comparably to model training. As a result, our study includes 2192 daily samples from Switzerland and 2512 daily samples from Gauteng. The remote sensing spectral indices cover the area surrounding each station, although the exact resolution is unknown. The integration of these spectral indices enriches the dataset, providing additional context for analyzing and predicting PM_2.5 concentrations.
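As a brief illustration of Z-score normalization, the sketch below standardizes two features with very different scales using only the Python standard library. The feature names and values are made up, and the use of the population standard deviation is an assumption; the study does not state which variant was used.

```python
from statistics import fmean, pstdev


def zscore(values):
    """Standardize a feature to zero mean and unit (population) std."""
    mu = fmean(values)
    sigma = pstdev(values, mu)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical daily values on very different scales.
temperature_c = [2.0, 4.0, 6.0, 8.0]
pm25 = [5.0, 15.0, 25.0, 35.0]

scaled = {"temperature": zscore(temperature_c), "pm25": zscore(pm25)}
```

After scaling, both features span the same range despite their different units. In a forecasting setting, the mean and standard deviation would normally be computed on the training split only and reused for validation and test data.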

3.3. Data Processing

The meteorological and pollutant concentration data were visualized to check for any significant gaps. Aside from the stations that did not record certain pollutant concentrations, there were minimal instances of missing data. Missing values were filled using cubic-spline interpolation, which proved effective for these series. Once the gaps were filled, all features were converted to SI units and resampled to a daily rate to match the remote sensing data. This preprocessing step ensured consistency across the datasets, enabling accurate analysis and modeling.
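The daily resampling step can be sketched as follows. The study does not state which aggregation was used, so the daily mean here is an assumption, as are the function name and the synthetic readings; gaps are assumed to have been filled beforehand.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def resample_daily_mean(timestamps, values):
    """Aggregate an hourly series into one mean value per calendar day."""
    buckets = defaultdict(list)
    for ts, value in zip(timestamps, values):
        buckets[ts.date()].append(value)
    # Sort by date so the result is in chronological order.
    return {day: sum(vs) / len(vs) for day, vs in sorted(buckets.items())}


# Two days of synthetic hourly readings: 10.0 on day one, 20.0 on day two.
start = datetime(2016, 1, 1)
hours = [start + timedelta(hours=h) for h in range(48)]
readings = [10.0] * 24 + [20.0] * 24

daily = resample_daily_mean(hours, readings)
```

Other aggregations, such as the daily maximum, would slot into the same structure by replacing the mean in the return statement.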

3.3.1. Remote Sensing Spectral Index

The remote sensing data were less reliable than the monitoring station and meteorological data, with many gaps and anomalies present in the datasets. For the remote sensing data, linear interpolation was applied, as cubic-spline interpolation caused the data to become unstable at the endpoints. An additional outlier-handling step was essential because, even after missing values are addressed, environmental time-series data may still exhibit abnormal spikes, sensor malfunctions, or unrealistic fluctuations that do not correspond to actual atmospheric conditions. To mitigate these issues, we employed an algorithm designed to identify outliers by examining both the temporal dynamics of each variable and its underlying statistical distribution. Measurements that deviated substantially from expected temporal patterns, such as isolated extreme peaks, negative pollutant concentrations, or values beyond physically plausible thresholds, were flagged as anomalous. Once identified, these data points were removed or corrected using appropriate imputation strategies to maintain the coherence and continuity of the dataset. This additional cleaning process ensured that the final dataset used for model training was reliable, internally consistent, and free from distortions that could adversely affect the predictive performance of the ST-GNN model.
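The gap filling and outlier handling described above can be sketched as below. The study does not name its outlier algorithm, so the median absolute deviation (MAD) rule, the `threshold` of 3.5, the `valid_range`, and the sample series are all assumptions chosen for illustration.

```python
from statistics import median


def fill_gaps_linear(series):
    """Fill None gaps by linear interpolation between valid neighbours.

    Leading and trailing gaps are filled with the nearest valid value
    (an assumption of this sketch).
    """
    filled = list(series)
    known = [i for i, v in enumerate(filled) if v is not None]
    if not known:
        raise ValueError("series has no valid observations")
    for i in range(known[0]):
        filled[i] = filled[known[0]]
    for i in range(known[-1] + 1, len(filled)):
        filled[i] = filled[known[-1]]
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            frac = (i - left) / span
            filled[i] = filled[left] + frac * (filled[right] - filled[left])
    return filled


def flag_outliers(series, valid_range=(-1.0, 1.0), threshold=3.5):
    """Flag values outside a plausible range or far from the median (MAD rule)."""
    med = median(series)
    mad = median(abs(v - med) for v in series)
    low, high = valid_range
    flags = []
    for v in series:
        robust_z = 0.0 if mad == 0 else 0.6745 * (v - med) / mad
        flags.append(v < low or v > high or abs(robust_z) > threshold)
    return flags


# Hypothetical spectral-index series with a two-day gap and one spike.
raw = [0.40, None, None, 0.46, 0.45, 3.00, 0.44, 0.43]
filled = fill_gaps_linear(raw)
flags = flag_outliers(filled)

# Treat flagged points as missing and re-impute them linearly.
masked = [None if bad else v for v, bad in zip(filled, flags)]
cleaned = fill_gaps_linear(masked)
```

Here only the spike at index 5 is flagged, and it is replaced by the midpoint of its neighbours. A production pipeline would likely apply the rule in a rolling window so that seasonal changes are not mistaken for anomalies.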

Given the large number of features used to predict our target, there was a risk of the model suffering from the curse of dimensionality. To address this, the BorutaShap feature selection method was employed to identify the most important remote sensing spectral indices for predicting PM_2.5 concentrations. The feature selection process highlighted the following indices as the most important: Fluorescence Correction Vegetation Index (FCVI), Green Atmospherically Resistant Vegetation Index (GARI), Normalized Multi-band Drought Index (NMBDI), Normalized Difference Vegetation Index (NDVI), Aerosol-Free Vegetation Index at 2100 nm (AFRI 2100), Green Leaf Index (GLI), and Global Vegetation Moisture Index (GVMI), as shown in Equations (1)–(7), where N is the near-infrared (NIR) reflectance, R, G, and B are the red, green, and blue reflectances, and S_{1} and S_{2} are shortwave infrared bands (typically SWIR1 and SWIR2). These vegetation indices have shown a strong correlation with PM_2.5 concentrations, particularly the spectral bands in the visible and near-infrared regions.

FCVI = N - \frac{R + G + B}{3.0} \quad (1)

GARI = \frac{N - (G - (B - R))}{N - (G + (B - R))} \quad (2)

NMBDI = \frac{N - (S_{1} - S_{2})}{N + (S_{1} - S_{2})} \quad (3)

NDVI = \frac{N - R}{N + R} \quad (4)

AFRI_{2100} = \frac{N - 0.5 S_{2}}{N + 0.5 S_{2}} \quad (5)

GLI = \frac{2.0 G - R - B}{2.0 G + R + B} \quad (6)

GVMI = \frac{(N + 0.1) - (S_{2} + 0.02)}{(N + 0.1) + (S_{2} + 0.02)} \quad (7)
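Equations (1)–(7) translate directly into code. The sketch below evaluates them for a single pixel; the function name and the reflectance values are hypothetical, and no guard against zero denominators is included.

```python
def spectral_indices(N, R, G, B, S1, S2):
    """Evaluate Equations (1)-(7) for one set of band reflectances.

    N: near-infrared, R/G/B: red/green/blue, S1/S2: shortwave infrared.
    """
    return {
        "FCVI": N - (R + G + B) / 3.0,                                   # Eq. (1)
        "GARI": (N - (G - (B - R))) / (N - (G + (B - R))),               # Eq. (2)
        "NMBDI": (N - (S1 - S2)) / (N + (S1 - S2)),                      # Eq. (3)
        "NDVI": (N - R) / (N + R),                                       # Eq. (4)
        "AFRI2100": (N - 0.5 * S2) / (N + 0.5 * S2),                     # Eq. (5)
        "GLI": (2.0 * G - R - B) / (2.0 * G + R + B),                    # Eq. (6)
        "GVMI": ((N + 0.1) - (S2 + 0.02)) / ((N + 0.1) + (S2 + 0.02)),   # Eq. (7)
    }


# Hypothetical surface reflectances for a vegetated pixel.
sample = spectral_indices(N=0.5, R=0.1, G=0.2, B=0.05, S1=0.3, S2=0.2)
```

For real imagery, the same expressions would be applied element-wise to band arrays, with pixels masked wherever a denominator is zero or the input is missing.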

3.3.2. Preparation

Once the remote sensing spectral indices were obtained, they were concatenated with the weather and pollutant concentration datasets for each station. This integration resulted in a comprehensive dataset that combined spectral, meteorological, and pollutant data, enabling a more robust analysis and prediction of PM_2.5 concentrations.

3.4. Methodology

The aim of this study is to forecast PM_2.5 concentrations up to 7 days ahead at a given monitoring station. One model that has proven effective for forecasting PM_2.5 concentrations is the PM_2.5-GNN, a hybrid model that combines a Graph Neural Network (GNN) and a Recurrent Neural Network (RNN). In this section, we provide a detailed description of the model used for predicting PM_2.5 concentrations. The PM_2.5-GNN model was originally proposed by Wang et al. in 2020 [31], who demonstrated its effectiveness in leveraging domain knowledge and capturing long-term dependencies, both of which are critical for PM_2.5 forecasting.

In our study, we adapted the model and applied it to our dataset to investigate whether the inclusion of remote-sensing hyperspectral indices as node features, along with meteorological information and pollutant concentration data, could improve the prediction accuracy of PM_2.5 concentrations.

3.4.1. Model Overview

The PM_2.5-GNN is a hybrid model that combines a Graph Neural Network (GNN) and a Recurrent Neural Network (RNN) to model spatial dependencies and temporal dynamics, respectively.

This two-stage model represents the data as a graph, where the nodes correspond to ground-based sensors that monitor air quality, measuring various pollutants and meteorological data. The edges represent the interactions between these sensors. The GNN employs a message-passing paradigm, where nodes exchange information along the edges, iteratively aggregating and updating their representations. This process allows the model to understand spatial correlations and capture the horizontal transport of pollutants.

In the second stage, a Gated Recurrent Unit (GRU) is applied to the output of the knowledge-enhanced GNN to capture the temporal dependencies in the data. By combining the GNN’s ability to model spatial dependencies with the GRU’s capability to handle temporal sequences, the PM_2.5-GNN model provides a framework for predicting PM_2.5 concentrations across a network of stations. Remote-sensing hyperspectral indices, meteorological data, and pollutant concentration data are supplied as features, with the aim of further improving the model’s predictive power.
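The two-stage idea can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the PM_2.5-GNN implementation: the mean aggregation of neighbor features, the weight shapes, the random initialization, the toy graph, and the function names `message_passing` and `gru_step` are all hypothetical choices made for the example.

```python
import numpy as np

def message_passing(node_feats, edges):
    """One round of mean aggregation over incoming edges (illustrative choice)."""
    n, d = node_feats.shape
    agg = np.zeros((n, d))
    count = np.zeros(n)
    for src, dst in edges:                 # directed edge src -> dst
        agg[dst] += node_feats[src]
        count[dst] += 1
    count[count == 0] = 1                  # nodes without incoming edges keep a zero message
    agg = agg / count[:, None]
    # Concatenate each node's own features with its aggregated message.
    return np.concatenate([node_feats, agg], axis=1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, params):
    """GRU cell update (one common convention), applied row-wise to every node."""
    Wz, Uz, Wr, Ur, Wh, Uh = params
    z = sigmoid(x @ Wz + h @ Uz)               # update gate
    r = sigmoid(x @ Wr + h @ Ur)               # reset gate
    h_tilde = np.tanh(x @ Wh + (r * h) @ Uh)   # candidate state
    return (1 - z) * h + z * h_tilde

rng = np.random.default_rng(0)
n_nodes, n_feats, hidden = 4, 3, 8
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]       # hypothetical directed graph
in_dim = 2 * n_feats                           # own features + aggregated message
params = [rng.normal(scale=0.1, size=shape)
          for shape in [(in_dim, hidden), (hidden, hidden)] * 3]

h = np.zeros((n_nodes, hidden))
for t in range(5):                             # five time steps of random features
    feats = rng.normal(size=(n_nodes, n_feats))
    h = gru_step(message_passing(feats, edges), h, params)
```

After the loop, `h` holds one hidden state per station; in a full model a final layer would map each row to a PM_2.5 estimate.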

3.4.2. Model Architecture

The prediction of PM_2.5 concentration is framed as a spatiotemporal sequence prediction problem. We denote the PM_2.5 concentration at time step t as $X^{t} \in \mathbb{R}^{N \times 1}$, where N is the number of monitoring stations measuring PM_2.5. A directed graph $G = (V, E, A)$ is constructed, where V is the set of nodes representing the monitoring stations, A is the adjacency matrix determining the potential edges, and E is the set of edges representing the interactions between the monitoring stations.

We define $P^{t} \in \mathbb{R}^{N \times p}$ and $Q^{t} \in \mathbb{R}^{M \times q}$ as the feature matrices for the nodes and edges, respectively, at time step t, where p and q are the corresponding numbers of features, and $M = |E|$ is the number of edges. The PM_2.5-GNN explicitly encodes domain knowledge such as meteorological and geographical information into the attribute matrices and the graph structure. Additionally, it leverages domain knowledge from the near future by incorporating meteorological information from weather forecasting services.

Formally, forecasting at any starting point t is performed by feeding the observed PM_2.5 concentrations $X^{t}$ at the current time step, the next T steps of the attribute matrices $[P^{t+1}, \dots, P^{t+T}]$ and $[Q^{t+1}, \dots, Q^{t+T}]$, and the graph structure G into the model. The framework for the PM_2.5 concentration prediction problem is presented in Equation (8), with an illustration provided in Figure 3.

[X^{t}; P^{t+1}, \dots, P^{t+T}; Q^{t+1}, \dots, Q^{t+T}; G] \overset{f(\cdot)}{\to} [\hat{X}^{t+1}, \dots, \hat{X}^{t+T}]

(8)
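One way to read Equation (8) is as an autoregressive rollout, in which each estimate is fed back in together with the next step's node and edge attributes. The sketch below assumes that reading; `rollout` and `damped_step` are hypothetical names, and `damped_step` is only a stand-in for the trained mapping f(·), not the model itself.

```python
from typing import Callable, List, Sequence

Vector = List[float]

def rollout(x_t: Vector,
            node_feats: Sequence,   # [P^{t+1}, ..., P^{t+T}]
            edge_feats: Sequence,   # [Q^{t+1}, ..., Q^{t+T}]
            graph,
            step: Callable) -> List[Vector]:
    """Return [X_hat^{t+1}, ..., X_hat^{t+T}] by feeding each estimate back in."""
    if len(node_feats) != len(edge_feats):
        raise ValueError("need one P and one Q per forecast step")
    preds, x_prev = [], x_t
    for p_k, q_k in zip(node_feats, edge_feats):
        x_prev = step(x_prev, p_k, q_k, graph)
        preds.append(x_prev)
    return preds

def damped_step(x_prev, p_k, q_k, graph):
    """Hypothetical stand-in for f(.): decay each station's value by 10%."""
    return [0.9 * v for v in x_prev]

T = 7  # forecast horizon in days, as in this study
forecast = rollout([10.0, 20.0], [None] * T, [None] * T,
                   graph=None, step=damped_step)
```

Here `forecast` contains T lists, one per lead time, each with one value per station.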

3.4.3. Model Parameters

Hyperparameter tuning was performed using the Weights and Biases (WandB) platform, which assisted in identifying the optimal hyperparameters for the model. For the Switzerland datasets, the best hyperparameters were a batch size of 98, a weight decay of 0.0001, and a learning rate of 0.005. For the Gauteng, South Africa datasets, the optimal hyperparameters were a batch size of 184, a weight decay of 0.0248, and a learning rate of 0.0588. The differences in optimal hyperparameters between Switzerland and Gauteng reflect variations in the datasets. Switzerland’s lower pollution variability favors a smaller batch size and learning rate, while Gauteng’s more heterogeneous data requires a larger batch size and adjusted learning rate. These region-specific settings do not compromise the general applicability of the PM_2.5-GNN framework, as the architecture remains fully transferable and adapting hyperparameters to local data is standard practice in machine learning.

The model was run with two types of datasets: one consisting of meteorological and pollutant concentration data only, and the other combining meteorological data, pollutant concentrations, and remote-sensing spectral indices. The model was trained for up to 200 epochs, with early stopping after 10 epochs without improvement to prevent overfitting. The Adam optimizer with the SmoothL1 loss was used for the Swiss datasets, and SGD with the L1 loss for the Gauteng datasets. In this way the training setup was tuned to the specific characteristics of each dataset.
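The stopping rule can be illustrated with a short, framework-free sketch. It assumes that the 10-epoch criterion is a patience counter on the validation error; the function name and the list of precomputed validation losses are hypothetical stand-ins for a real training loop.

```python
def train_with_early_stopping(val_losses, max_epochs=200, patience=10):
    """Return (stop_epoch, best_epoch, best_loss) for a list of validation losses.

    `val_losses` stands in for the per-epoch validation error that a real
    training loop would compute; epochs are 1-indexed.
    """
    best_loss, best_epoch, stale = float("inf"), 0, 0
    epoch = 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if loss < best_loss:
            best_loss, best_epoch, stale = loss, epoch, 0
        else:
            stale += 1
            if stale >= patience:      # no improvement for `patience` epochs
                break
    return epoch, best_epoch, best_loss

# Validation error improves for three epochs, then plateaus.
losses = [1.0, 0.8, 0.7] + [0.75] * 50
result = train_with_early_stopping(losses)   # (13, 3, 0.7)
```

In this toy run training stops at epoch 13, ten epochs after the best validation loss at epoch 3; in practice the weights saved at the best epoch would be the ones evaluated on the test set.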

3.5. Adaptation to Our Datasets

Nodes. Due to the limited number of monitoring stations in both Switzerland and Gauteng, we treated the monitoring stations themselves as the nodes. This approach differs from the original PM_2.5-GNN paper by Wang et al. (2020) [31], where monitoring stations were averaged for each city. While we aimed to keep the node features as similar as possible to those in the original paper, we also incorporated additional features, such as pollutant concentrations and spectral indices. This addition provided a more detailed set of data for each node.

Graph Construction. To adapt our dataset for use in the PM_2.5-GNN model, we needed to create a graph representation of the data. This was essential since we were working with a multivariate time series regression problem across multiple nodes. We constructed the graph using the geographical coordinates of the monitoring stations, specifically their longitude and latitude. By calculating the distance and direction between each pair of nodes, we ensured that each node was aware of the positions of other stations and their spatial relationships.

We also encoded these spatial relationships into the graph by using distance and direction as edge features. Three constraints, incorporated into the adjacency matrix, refine the graph structure and encode geographical information. First, two stations are connected only if they lie within 300 km of each other, which restricts edges to stations close enough to plausibly influence each other's pollutant levels and avoids unnecessary complexity in the graph. Second, stations are treated as disconnected if a mountain range exceeding 1200 m in altitude lies between them. Third, stations whose altitudes differ by more than 450 m are not connected. Together, these conditions define the spatial relationships and interactions between monitoring stations that the PM_2.5-GNN model processes alongside the meteorological and pollutant data.
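The distance and altitude-difference rules can be sketched as follows. This is an illustration only: the haversine distance, the initial-bearing formula, the function names, and the three stations with their coordinates and altitudes are assumptions made for the example, and the mountain-range rule is omitted because it needs an elevation profile between stations.

```python
import math

def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def bearing_deg(lon1, lat1, lon2, lat2):
    """Initial bearing from point 1 to point 2, degrees clockwise from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlmb))
    return math.degrees(math.atan2(y, x)) % 360.0

def build_edges(stations, max_km=300.0, max_alt_diff_m=450.0):
    """Directed edges (src, dst) -> (distance_km, bearing_deg) passing both rules."""
    edges = {}
    for a, (lon_a, lat_a, alt_a) in stations.items():
        for b, (lon_b, lat_b, alt_b) in stations.items():
            if a == b:
                continue
            dist = haversine_km(lon_a, lat_a, lon_b, lat_b)
            if dist <= max_km and abs(alt_a - alt_b) <= max_alt_diff_m:
                edges[(a, b)] = (dist, bearing_deg(lon_a, lat_a, lon_b, lat_b))
    return edges

# Hypothetical stations: (longitude, latitude, altitude in m).
stations = {
    "A": (7.58, 47.54, 300.0),
    "B": (8.53, 47.38, 400.0),
    "C": (8.96, 46.01, 1000.0),
}
edges = build_edges(stations)
```

With these made-up values, A and B are connected in both directions, while C is within 300 km of both but is excluded by the 450 m altitude-difference rule.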

3.6. Experiment

Since the aim of this study is to compare the prediction accuracy of PM_2.5 concentration with and without the use of spectral indices, the model was tested on two datasets. The first dataset served as the control, with input features consisting of pollutant concentrations and weather data; the second added the remote-sensing spectral indices to these features. Each dataset was split into three subsets (training, validation, and test), as shown in Table 4. The training sets for both Gauteng and Switzerland spanned at least four years of data.

During the training process, the termination criterion was evaluated after each epoch. If the maximum number of epochs was reached or if the validation error did not improve for a specified period, the training process was terminated. In the final step, the model’s prediction accuracy was tested using the test dataset. This approach ensures that the model is trained on a comprehensive dataset and that the results are reliable for predicting PM_2.5 concentrations under different input conditions. The comparison of prediction accuracy between the two datasets will allow us to assess the impact of incorporating remote-sensing spectral indices on the model’s performance.

The model’s performance is evaluated using several metrics. The root mean squared error (RMSE) and mean absolute error (MAE), commonly used to assess accuracy in time series prediction, are defined in Equation (9),

\begin{matrix} \mathrm{RMSE} & = \sqrt{\frac{\sum_{i = 1}^{n} (y_{i} - {\hat{y}}_{i})^{2}}{n}} \\ \mathrm{MAE} & = \frac{\sum_{i = 1}^{n} | y_{i} - {\hat{y}}_{i} |}{n} \end{matrix}

(9)

where $y_{i}$ and ${\hat{y}}_{i}$ are the ground truth and the prediction, respectively, and n is the total number of observed data samples. In addition to these standard metrics, domain-specific meteorological metrics are used to assess the model’s performance near the pollution threshold. The threshold for both Switzerland and Gauteng is set at 10 μg/m³. These additional metrics include the probability of detection (POD), critical success index (CSI), and false alarm rate (FAR).

Observations and predictions are first binarized at the threshold, with 1 denoting a value above it. Hits are cases with (prediction = 1, truth = 1), misses are cases with (prediction = 0, truth = 1), and false alarms are cases with (prediction = 1, truth = 0); CSI, FAR, and POD are then calculated by Equation (10). Used together with RMSE and MAE, these metrics assess both the model’s overall accuracy and its behavior near critical pollution thresholds, which is what matters for air quality monitoring and public health advisories.

\begin{matrix} \mathrm{CSI} & = \frac{\text{hits}}{\text{hits} + \text{misses} + \text{false alarms}} \\ \mathrm{FAR} & = \frac{\text{false alarms}}{\text{hits} + \text{false alarms}} \\ \mathrm{POD} & = \frac{\text{hits}}{\text{hits} + \text{misses}} \end{matrix}

(10)
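As a worked illustration of Equations (9) and (10), the sketch below computes all five metrics on a small made-up sample. The function names, the sample values, and the use of a strict "greater than" comparison at the 10 μg/m³ threshold are assumptions for the example.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error, Equation (9)."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error, Equation (9)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def threshold_scores(y_true, y_pred, threshold=10.0):
    """CSI, FAR and POD, Equation (10), after binarizing at `threshold`.

    Using a strict '>' for exceedance is an assumption of this sketch.
    """
    hits = misses = false_alarms = 0
    for t, p in zip(y_true, y_pred):
        truth, pred = t > threshold, p > threshold
        if pred and truth:
            hits += 1
        elif truth:
            misses += 1
        elif pred:
            false_alarms += 1

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "CSI": ratio(hits, hits + misses + false_alarms),
        "FAR": ratio(false_alarms, hits + false_alarms),
        "POD": ratio(hits, hits + misses),
    }

# Made-up observations and predictions in ug/m^3.
y_true = [12.0, 8.0, 15.0, 9.0, 11.0]
y_pred = [11.0, 11.0, 9.0, 7.0, 14.0]
scores = threshold_scores(y_true, y_pred)
errors = (rmse(y_true, y_pred), mae(y_true, y_pred))
```

For this sample there are two hits, one miss, and one false alarm, giving CSI = 0.5, FAR = 1/3, and POD = 2/3, with MAE = 3.0.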

4. Results

The performance of the model is calculated by averaging over the 28 features. We compare these metrics in Table 5 and Table 6 for the two datasets. For Switzerland, the dataset incorporating spectral indices outperforms the one without them on all metrics; for Gauteng, it does so on all metrics except FAR, which increases slightly. Overall, this indicates that the inclusion of hyperspectral indices enhances the prediction accuracy of PM_2.5 concentration.

Figure 4 and Figure 5 illustrate the daily actual and predicted PM_2.5 concentrations for the selected Switzerland stations, obtained by testing the dataset with spectral indices (MSP) and the dataset without them (MP), respectively. Similarly, Figure 6 and Figure 7 show daily actual and predicted PM_2.5 concentrations for the selected Gauteng stations, based on the MSP and MP datasets, respectively. For the 1-day ahead forecast, the dataset without spectral indices exhibits a lower error. However, from the 2-day ahead forecast onwards, the dataset with spectral indices outperforms the other. This indicates that the inclusion of spectral indices improves prediction performance over longer horizons.
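An error-by-lead-time comparison of this kind can be computed with a per-horizon RMSE. The sketch below is illustrative: the function name and the array layout, with one row per forecast start and one column per lead time, are assumptions, and the numbers are made up.

```python
import numpy as np

def rmse_per_horizon(y_true, y_pred):
    """RMSE for each lead time; inputs have shape (samples, horizons)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.sqrt(np.mean((y_true - y_pred) ** 2, axis=0))

# Three forecast starts, two lead times (made-up values).
truth = np.zeros((3, 2))
preds = np.array([[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]])
per_day = rmse_per_horizon(truth, preds)
```

Computing this vector once for each dataset and comparing the two element by element shows at which lead time one set of input features starts to outperform the other.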

5. Conclusions

This study aimed to enhance the prediction of PM_2.5 concentrations by integrating remote-sensing spectral indices with traditional meteorological and pollutant data using spatiotemporal graph neural networks (ST-GNNs). The findings indicate that incorporating hyperspectral indices improves the accuracy of PM_2.5 forecasts. The performance of the proposed model was evaluated using RMSE, MAE, POD, CSI, and FAR. The dataset including spectral indices outperformed the dataset without them on all metrics except FAR in Gauteng. For Switzerland, the model integrating spectral indices achieved an RMSE of 1.4591, compared to 1.4660 without the indices. Similarly, MAE improved from 1.1147 to 1.1053, CSI increased from 0.8345 to 0.8387, POD improved from 0.8961 to 0.8972, and FAR decreased from 0.0760 to 0.0719. In Gauteng, South Africa, RMSE decreased from 6.3486 to 6.2319, MAE from 4.4891 to 4.4066, CSI increased from 0.9555 to 0.9560, and POD from 0.9699 to 0.9732. However, FAR slightly increased from 0.0154 to 0.0181. In public alert systems such as weather warnings or disaster alerts, even a small increase in FAR can cause problems: if the FAR is too high, people may stop taking alerts seriously and ignore real warnings, which can delay or prevent action during an actual emergency. Error analysis over the prediction length indicated that while the one-day-ahead forecast without spectral indices had a lower error, the dataset with spectral indices outperformed from the two-day-ahead forecast onwards. This suggests that spectral indices provide a more robust prediction for longer forecasting periods, particularly as the error stabilizes over time for Swiss monitoring stations.

Compared with the results reported in Ref. [31], the PM_2.5-GNN achieves substantially better performance for both Gauteng and Switzerland, even though several important features could not be included due to limitations in the available data sources. The study successfully demonstrated that integrating remote-sensing hyperspectral indices with traditional meteorological and pollutant data improves the accuracy of PM_2.5 concentration forecasts. Furthermore, by evaluating the proposed spatial-temporal graph neural network (ST-GNN) framework in regions with distinct environmental characteristics, such as Switzerland, with its strict regulatory context and low levels of pollution, and Gauteng, a densely populated and industrialized area with persistent challenges to air quality, we confirmed the ability of the model to generalize in spatially diverse settings.

Although the magnitude of overall improvements in standard metrics (Table 5 and Table 6) appears modest, it is important to highlight that the inclusion of spectral indices may provide more substantial benefits for rare or extreme pollution events, which are particularly critical for public health and emergency response. Accurate forecasting of such events can significantly enhance early warning systems and targeted interventions, even if improvements in average metrics are small. These findings underscore the potential of ST-GNNs not only for accurate and robust air quality prediction but also for broader application in international contexts. Despite certain limitations, the results provide a strong foundation for developing flexible and transferable forecasting tools, with significant implications for public health, policy-making, and environmental management.

Limitations

Several limitations were identified in this study. Firstly, the exclusion of PBL height and K-index from the Swiss and South African studies could have influenced the predictions, as these parameters have been shown to correlate significantly with pollutant levels in other research. They were excluded because insufficient data were available for them, which likely affected the overall accuracy of the model. Additionally, the lack of terrain data in the South African graph construction may have affected the spatial interactions modeled in the Graph Neural Network (GNN). Another challenge was the high percentage of missing data for the remote sensing spectral indices, which led to uncertainties in selecting the final input features. Finally, this study did not include Aerosol Optical Depth (AOD), which is often considered a crucial feature in air quality modeling, because of its low spatial coverage and temporal resolution.

6. Future Work

Future research should focus on collecting more extensive data from additional provinces in South Africa, as well as from other countries such as the USA and Brazil, to examine the impact of dataset size and diversity on GNN performance. Expanding the data collection could help generalize the findings and enhance the robustness of the model across different geographic and environmental contexts.

Author Contributions

K.M., V.C. and C.R. contributed to the original draft preparation, methodology, formal analysis, software development, and data visualization. C.M. contributed to modelling and the generation of plots, as well as manuscript review and editing. P.B. was responsible for validation and manuscript review and editing. M.A.M., I.A., T.M., N.B., J.K. and M.K. contributed to manuscript review and editing. B.M. and E.K.N. provided supervision and contributed to manuscript review and editing. All authors have read and agreed to the published version of the manuscript.

Funding

This research is funded by Canada’s International Development Research Centre (IDRC) (Grant No. 109981-001). J.K. acknowledges support from the NSERC Discovery Grant (Grant No. RGPIN-2022-04559), the NSERC Discovery Launch Supplement (Grant No. DGECR-2022-00454), and the New Frontiers in Research Fund-Exploratory (Grant No. NFRFE-2021-00879).

Data Availability Statement

The air quality datasets analysed in this study are publicly available from the following sources. South African air pollution data were obtained from the South African Air Quality Information System (SAAQIS), an online portal providing near real-time and historical air quality data from monitoring stations across South Africa, including hourly PM_2.5 measurements (available at https://saaqis.environment.gov.za/ (accessed on 25 September 2025)). In addition, global air quality data can be accessed through platforms such as OpenAQ, which aggregates and harmonizes data from ground-level monitoring stations including those in South Africa (see https://openaq.org/ (accessed on 25 September 2025)). Swiss air quality measurements were sourced from the National Air Pollution Monitoring Network (NABEL), which provides real-time and historical data on principal pollutants at multiple locations across Switzerland (data access and query available at https://www.bafu.admin.ch/en/air-quality-data (accessed on 25 September 2025)). These datasets are publicly archived and openly accessible. If additional processed data were generated in the course of this study, they are available from the corresponding author upon reasonable request.

Acknowledgments

We would like to thank and acknowledge the support provided by Sibanye-Stillwater through the Sibanye-Stillwater Digital Mining Laboratory (DigiMine) and the Wits Mining Institute (WMI).

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

1. Van der Walt, P.A.; Garl, R.M.; Burger, R.P.; Naidoo, M. Impacts of Coal-Fired Power Plants on Aerosol Particles in the Highveld. Master’s Thesis, North-West University, Potchefstroom, South Africa, 2023.
2. OECD, IEA. Energy and Air Pollution: World Energy Outlook Special Report 2016; IEA: Paris, France, 2016.
3. Bartington, S.; William, A. Prevalence of Health Impacts Related to Exposure to Poor Air Quality Among Children in Low and Lower Middle-Income Countries; Institute of Development Studies: Brighton, UK, 2020.
4. Xing, Y.F.; Xu, Y.H.; Shi, M.H.; Lian, Y.X. The impact of PM_2.5 on the human respiratory system. J. Thorac. Dis. 2016, 8, E69.
5. Manisalidis, I.; Stavropoulou, E.; Stavropoulos, A.; Bezirtzoglou, E. Environmental and health impacts of air pollution: A review. Front. Public Health 2020, 8, 14.
6. Xu, J.; Hu, W.; Liang, D.; Gao, P. Photochemical impacts on the toxicity of PM_2.5. Crit. Rev. Environ. Sci. Technol. 2022, 52, 130–156.
7. Masood, A.; Hameed, M.M.; Srivastava, A.; Pham, Q.B.; Ahmad, K.; Razali, S.F.; Baowidan, S.A. Improving PM_2.5 prediction in New Delhi using a hybrid extreme learning machine coupled with snake optimization algorithm. Sci. Rep. 2023, 13, 21057.
8. Masoud, A.A. Spatio-temporal patterns and trends of the air pollution integrating MERRA-2 and in situ air quality data over Egypt (2013–2021). Air Qual. Atmos. Health 2023, 16, 1543–1570.
9. World Health Organization. Ambient Air Pollution: A Global Assessment of Exposure and Burden of Disease; WHO: Geneva, Switzerland, 2016.
10. Badicu, A.; Suciu, G.; Balanescu, M.; Dobrea, M.; Birdici, A.; Orza, O.; Pasat, A. PMs concentration forecasting using ARIMA algorithm. In Proceedings of the 2020 IEEE 91st Vehicular Technology Conference (VTC2020-Spring), Antwerp, Belgium, 25–28 May 2020; pp. 1–5.
11. Morapedi, T.D.; Obagbuwa, I.C. Air pollution particulate matter (PM_2.5) prediction in South African cities using machine learning techniques. Front. Artif. Intell. 2023, 6, 1230087.
12. Chae, S.; Shin, J.; Kwon, S.; Lee, S.; Kang, S.; Lee, D. PM_10 and PM_2.5 real-time prediction models using an interpolated convolutional neural network. Sci. Rep. 2021, 11, 11952.
13. Chen, M.H.; Chen, Y.C.; Chou, T.Y.; Ning, F.S. PM_2.5 Concentration Prediction Model: A CNN–RF Ensemble Framework. Int. J. Environ. Res. Public Health 2023, 20, 4077.
14. Vignesh, P.P.; Jiang, J.H.; Kishore, P. Predicting PM_2.5 concentrations across USA using machine learning. Earth Space Sci. 2023, 10, e2023EA002911.
15. Zaini, N.A.; Ean, L.W.; Ahmed, A.N.; Abdul Malek, M.; Chow, M.F. PM_2.5 forecasting for an urban area based on deep learning and decomposition method. Sci. Rep. 2022, 12, 17565.
16. Liu, Y.; Zhang, D.; Wang, W.; Zhu, Q.; Bi, J.; Scovronick, N.; Naidoo, M.; Garland, R. A machine learning model to estimate ambient PM_2.5 concentrations in industrialized highveld region of South Africa. In Proceedings of the AGU Fall Meeting 2021, New Orleans, LA, USA, 13–17 December 2021; Volume 2021, p. A35G-1710.
17. Singh, V.; Sahana, S.K.; Bhattacharjee, V. Integrated Spatio-Temporal Graph Neural Network for Traffic Forecasting. Appl. Sci. 2024, 14, 11477.
18. Yu, W.; Wang, S.; Zhang, C.; Chen, Y.; Sheng, X.; Yao, Y.; Liu, J.; Liu, G. Integrating Spatio-Temporal and Generative Adversarial Networks for Enhanced Nowcasting Performance. Remote Sens. 2023, 15, 3720.
19. Bentsen, L.Ø.; Warakagoda, N.D.; Stenbro, R.; Engelstad, P. A Unified Graph Formulation for Spatio-Temporal Wind Forecasting. Energies 2023, 16, 7179.
20. Han, X.; Zhu, G.; Zhao, L.; Du, R.; Wang, Y.; Chen, Z.; Liu, Y.; He, S. Ollivier–Ricci Curvature Based Spatio-Temporal Graph Neural Networks for Traffic Flow Forecasting. Symmetry 2023, 15, 995.
21. Zhu, X.; Wang, J.; Wang, G.; Jiang, Y.; Sun, Y.; Zhao, H. STDNet: Spatio-Temporal Decompose Network for Predicting Arctic Sea Ice Concentration. Remote Sens. 2024, 16, 4534.
22. Yu, B.; Yin, H.; Zhu, Z. Spatio-Temporal Graph Convolutional Networks: A Deep Learning Framework for Traffic Forecasting. In Proceedings of the IJCAI 2018, Stockholm, Sweden, 13–19 July 2018.
23. Roy, A.; Roy, K.K.; Ali, A.A.; Amin, M.A.; Rahman, A.K.M.M. Unified Spatio-Temporal Modeling for Traffic Forecasting using Graph Neural Network. In Proceedings of the 2021 International Joint Conference on Neural Networks (IJCNN), Shenzhen, China, 18–22 July 2021.
24. Ju, W.; Zhao, Y.; Qin, Y.; Yi, S.; Yuan, J.; Xiao, Z.; Luo, X.; Yan, X.; Zhang, M. COOL: A Conjoint Perspective on Spatio-Temporal Graph Neural Network for Traffic Forecasting. Inf. Fusion 2024, 107, 102341.
25. Jeon, S.B.; Jeong, M.-H. Integrating Spatio-Temporal Graph Convolutional Networks with Convolutional Neural Networks for Predicting Short-Term Traffic Speed in Urban Road Networks. Appl. Sci. 2024, 14, 6102.
26. Zhou, Y.; Liu, Y.; Ning, N.; Wang, L.; Zhang, Z.; Gao, X.; Lu, N. Integrating Knowledge Representation into Traffic Prediction: A Spatial–Temporal Graph Neural Network with Adaptive Fusion Features. In Complex & Intelligent Systems; Springer: Berlin/Heidelberg, Germany, 2023.
27. Huo, Y.; Zhang, H.; Tian, Y.; Wang, Z.; Wu, J.; Yao, X. A Spatiotemporal Graph Neural Network with Graph Adaptive and Attention Mechanisms for Traffic Flow Prediction. Electronics 2024, 13, 212.
28. Liu, Z.; Shojaee, P.; Reddy, C.K. Graph-based Multi-ODE Neural Networks for Spatio-Temporal Traffic Forecasting. arXiv 2023, arXiv:2305.18687.
29. Choi, S.; Kim, Y. Rad-cGAN v1.0: Radar-based precipitation nowcasting model with conditional generative adversarial networks for multiple dam domains. Geosci. Model Dev. (GMD) 2022, 15, 5967–5985.
30. Zhang, H.; Song, Y.; Han, C.; Zhang, L. Remote Sensing Image Spatiotemporal Fusion Using a Generative Adversarial Network. IEEE Trans. Geosci. Remote Sens. 2021, 59, 4273–4286.
31. Wang, S.; Li, Y.; Zhang, J.; Meng, Q.; Meng, L.; Gao, F. PM_2.5-GNN: A domain knowledge enhanced graph neural network for PM_2.5 forecasting. In Proceedings of the 28th International Conference on Advances in Geographic Information Systems, Virtual, 3–6 November 2020.

Figure 1. Switzerland measurement stations’ locations and types.

Figure 2. South Africa (Gauteng) measurement stations’ locations and types.

Figure 3. Illustration of the PM_2.5-GNN model process.

Figure 4. Daily actual and predicted PM_2.5 concentrations for the best- and worst-performing Switzerland stations obtained after testing the MSP dataset, showing 1-day ahead forecasts.

Figure 5. Daily actual and predicted PM_2.5 concentrations for the best- and worst-performing Switzerland stations obtained after testing the MP dataset, showing 1-day ahead forecasts.

Figure 6. Daily actual and predicted PM_2.5 concentrations for the best- and worst-performing Gauteng stations obtained after testing the MSP dataset, showing 1-day ahead forecasts.

Figure 7. Daily actual and predicted PM_2.5 concentrations for the best- and worst-performing Gauteng stations obtained after testing the MP dataset, showing 1-day ahead forecasts.

Table 1. Switzerland monitoring stations with their coordinates.

Station	Local ID	Longitude	Latitude
Basel: Binningen	BAS	7.58	47.54
Bern: Bollwerk	BER	7.44	46.95
Dübendorf: Empa	DUE	8.61	47.40
Härkingen: A1	HAE	7.82	47.31
Lugano: Università	LUG	8.96	46.01
Magadino: Cadenazzo	MAG	8.93	46.16
Payerne	PAY	6.94	46.81
Zürich: Kaserne	ZUE	8.53	47.38

Table 2. Gauteng monitoring stations with their coordinates.

Station	Local ID	Longitude	Latitude
Diepkloof	DIEP	27.957	−26.251
Etwatwa	ETWA	28.475	−26.117
Jabavu	JAB	27.872	−26.253
Kliprivier	KLIP	28.084	−26.421
Olifantsfontein	OLI	28.236	−25.974
Rosslyn	ROS	28.095	−25.625
Vanderbijlpark	VAN	27.817	−26.689
Bedfordview	BED	28.133	−26.179

Table 3. Features categorized into meteorology and pollutants.

Meteorology	Pollutants (μg/m³)
Temperature (degrees (°))	Sulfur Dioxide (SO₂)
Ambient Relative Humidity (%)	Nitrogen Dioxide (NO₂)
Solar Radiation (W/m²)	Nitrogen Monoxide (NO)
Ambient Pressure (Pascal)	Combined NO and NO₂ (NO_x)
Rain (mm)	Ozone (O₃)
Longitude (degrees (°))	Particulate matter ≤ 10 μm (PM_10)
Latitude (degrees (°))	Particulate matter ≤ 2.5 μm (PM_2.5)
Altitude (km)
Ambient wind speed (m/s)
Ambient wind direction (degrees (°))

Table 4. Data splitting into three datasets: train, validation, and test.

Country	Train	Validation	Test
Gauteng (SA)	Jan 2016–June 2020	July 2020–June 2021	Nov 2021–Nov 2022
Switzerland	Jan 2016–Dec 2019	Jan 2020–June 2020	Jan 2022–June 2022

Table 5. Metrics from the experiment showing the prediction results of the model comparing the two Switzerland datasets. The values represent the overall performance averaged over all stations, and the best results are highlighted in bold.

Metric	Meteo and Pollutants Only (MP)	Meteo Spectral Pollutants (MSP)
RMSE	1.4661 ± 0.0188	1.4593 ± 0.0093
MAE	1.1146 ± 0.0190	1.1052 ± 0.0101
CSI	0.8344 ± 0.0054	0.8386 ± 0.0063
POD	0.8960 ± 0.0113	0.8970 ± 0.0108
FAR	0.0761 ± 0.0076	0.0718 ± 0.0100

Table 6. Metrics from the experiment showing the prediction results of the model comparing the two Gauteng datasets. The values represent the overall performance averaged over all stations, and the best results are highlighted in bold.

Metric	Meteo and Pollutants Only (MP)	Meteo Spectral Pollutants (MSP)
RMSE	6.3480 ± 0.0005	6.2315 ± 0.0001
MAE	4.4881 ± 0.0001	4.4056 ± 0.0002
CSI	0.9545 ± 0.0006	0.9550 ± 0.0005
POD	0.9599 ± 0.0008	0.9631 ± 0.0009
FAR	0.0156 ± 0.0008	0.0180 ± 0.0006

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Chabalala, V.; Rudolph, C.; Mosala, K.; Nkadimeng, E.K.; Mosomane, C.; Mathaha, T.; Basu, P.; Mahboob, M.A.; Kong, J.; Bragazzi, N.; et al. Spatiotemporal Graph Neural Networks for PM_2.5 Concentration Forecasting. Air 2026, 4, 2. https://doi.org/10.3390/air4010002
