Article

A Hybrid SSA-VMD-GRU Model for Real-Time Traffic-Related Air Quality Index Prediction: Development and Validation

by Wenzhe Huang 1, Xiaoping Huang 2,*, Yaqiong Zhang 2 and Haoming Zhu 3
1 School of Energy and Environment, City University of Hong Kong, Hong Kong 999077, Hong Kong
2 School of Electrical and Electronic Engineering, Guilin University of Technology, Guilin 541004, China
3 College of Safety Science and Engineering, Xi’an University of Science and Technology, Xi’an 710054, China
* Author to whom correspondence should be addressed.
Sustainability 2025, 17(24), 11233; https://doi.org/10.3390/su172411233
Submission received: 23 October 2025 / Revised: 22 November 2025 / Accepted: 8 December 2025 / Published: 15 December 2025
(This article belongs to the Section Sustainable Engineering and Science)

Abstract

Rapid urbanization has exacerbated traffic congestion and the associated vehicle emissions, making real-time air quality index (AQI) prediction crucial for urban environmental management. Transportation emissions, including exhaust gases and particulate matter, are among the main factors causing urban environmental pollution. Traffic-related air pollution affects public health, quality of life, and well-being on a global scale, and it also affects socioeconomic development and people’s livelihoods. The AQI is a comprehensive indicator of the degree of air pollution. Knowing the pollution level in a specific area can help decision-makers manage traffic flow, reduce congestion and emissions, and improve traffic efficiency and environmental sustainability. Traditional prediction methods often suffer from low accuracy and cannot handle complex data effectively. This paper therefore develops a deep learning-based traffic AQI prediction model that combines the sparrow search algorithm (SSA), variational mode decomposition (VMD), and the gated recurrent unit (GRU). Experimental results on real-world datasets show that the SSA-VMD-GRU model reduces the mean absolute percentage error (MAPE) by approximately 8% compared with the standalone GRU model, offering a robust solution for real-time AQI forecasting and practical insights for urban traffic air quality monitoring.

1. Introduction

With globalization and economic development, urbanization is accelerating, leading to a concentration of population and economic activities in cities. The growth in urban population size has increased the demand for transportation, resulting in a significant rise in vehicle ownership and increasingly prominent traffic congestion issues [1]. Vehicle emissions and the release of pollutants such as particulate matter from increased traffic have become one of the major causes of urban environmental pollution. The air quality index (AQI) is used to describe the degree of air pollution. Early prediction models were traditionally based on statistics, including regression models, autoregressive moving average models, Markov models, and grey theory models. Machine learning enables computers to extract features from vast amounts of data and make predictions and decisions based on these features. In machine learning-based air quality prediction models, radial basis kernel functions have been used to train support vector machine (SVM) models for predicting urban air quality focused on densely populated cities [2]. In [3], random forests, SVMs, and regression models were used to explore an SVM model with hyperparameter tuning. An air quality prediction algorithm based on an improved pathfinder algorithm and sample entropy-optimized extreme learning machine (ELM) was proposed [4]. In [5], SVM was used to predict AQI combined with influencing factors such as land use, economic conditions, urban topography, and population distribution. Air quality in Shanghai was predicted using decision trees, SVMs, and BP neural networks [6]. An ELM-based AQI prediction model, incorporating the LIME (Local Interpretable Model-agnostic Explanations) model for feature selection, was proposed to improve accuracy [7]. An LSTM model was used to demonstrate that RNN models perform better in air quality prediction by comparing multiple linear regression models with RNN networks [8]. 
An LSTM-based air quality prediction model was introduced [9]. An LSTM model was built to predict the AQI in Lanzhou, China, with experimental results showing the model’s effectiveness [10]. A multi-time, multi-site air quality prediction model for Beijing, China, was established using a spatiotemporal clustering-based deep learning network [11]. The Community Multiscale Air Quality Modelling System (CMAQ) was improved using a CNN model [12]. An improved temporal convolutional network (TCN) for air quality prediction was proposed [13]. PM2.5 and PM10 concentrations were predicted for Lanzhou, China, using an autoregressive moving average model and a TCN model [14]. In [15], a Spatial–Temporal Similarity LSTM (STS-LSTM) model was proposed. In [16], a WOA-BiLSTM model was introduced. In [17], a deep belief network–extreme learning machine (DBN-ELM) was proposed based on cross-recurrence quantification analysis (CRQA). In [18], an LSTM model was used to correct the error series of ARIMA time series predictions, demonstrating that the ARIMA-LSTM hybrid model outperforms single models. An AQI prediction model based on a hybrid kernel extreme learning machine (HKELM) optimized by an improved Aquila optimizer (IAO) was proposed [19]. A model combining a convolutional neural network (CNN) with long short-term memory (LSTM) networks was introduced [20]. The AGA-LSTM model was analyzed for predicting PM2.5 concentrations [21]. An integrated deep learning model based on convolutional neural networks and gated recurrent units (CNN-GRU) was proposed for predicting AQI and compared with a multiverse optimizer-optimized generalized regression neural network (MVO-GRNN) and a genetic algorithm-optimized BP neural network (GA-BP) [22]. Experimental results showed that the CNN-GRU model has the smallest prediction error for AQI, although its computational cost is high [22,23].
An AQI prediction model based on CNN-LSTM-KAN hybrid modelling was proposed, but it has a high computational cost [24]. Because the AQI is a nonlinear time series, it is susceptible to factors such as the atmospheric environment and meteorological conditions. Many published prediction models have clear limitations: they are prone to falling into local optima, which leads to unreasonable deviations in the predicted traffic pollutant concentrations.
Therefore, this paper uses grey relational analysis and other methods to analyze the correlation between the monitored data and the AQI and to screen out feature parameters. It then constructs a hybrid SSA-VMD-GRU model for traffic-related AQI prediction, which integrates SSA for global parameter optimization, VMD for adaptive signal decomposition, and GRU for efficient temporal modelling, with the aim of improving accuracy and computational efficiency for real-time AQI prediction and overcoming the limitations of existing models. Compared with the Random Forest, XGBoost, EMD-LSTM, and CNN-GRU models, the SSA-VMD-GRU model demonstrates superior performance in both accuracy and computational efficiency. Its training speed is more than 30% higher than that of algorithms such as CNN-GRU, EMD-LSTM, and VMD-BiLSTM; its inference speed meets real-time requirements; and it is easier to deploy on edge devices such as traffic monitoring terminals. A comprehensive validation using multi-quarter meteorological data demonstrates the robustness of the SSA-VMD-GRU model across different temporal patterns.

2. Methods

The AQI is a comprehensive indicator used to represent air quality, with a value range of 0 to 500. The AQI is influenced by various factors. Fine particulate matter (PM2.5) and inhalable particulate matter (PM10) are among the major air pollutants [25,26]. Pollutants such as SO2 and carbon monoxide (CO) emitted from industrial activities and vehicle exhaust also contribute to elevated AQI levels [27]. Additionally, meteorological conditions like air temperature, humidity, and wind speed significantly impact the dispersion and accumulation of air pollutants. For instance, low wind speed and temperature inversion conditions can hinder the diffusion of air pollutants, potentially leading to an increase in AQI [28,29]. Therefore, investigating the influencing factors of AQI is crucial for its future prediction.

2.1. Theoretical Analysis of Influencing Factors

AQI is a dimensionless index that quantitatively describes the status of air quality [30]. For individual pollutants, it also stipulates sub-indices of air quality. There are six pollutants that directly affect the AQI value, namely PM2.5, PM10, NO2, CO, O3, and SO2 [31,32]. The AQI calculation formula is as follows in Equation (1) [33].
$$ I = \frac{I_{high} - I_{low}}{C_{high} - C_{low}} (C - C_{low}) + I_{low} \tag{1} $$
Here, $I$ is the value of the air quality index and $C$ is the pollutant concentration; $C_{high}$, $C_{low}$, $I_{high}$, and $I_{low}$ are all constants. $C_{high}$ is the concentration limit greater than or equal to $C$, while $C_{low}$ is the concentration limit less than or equal to $C$. $I_{low}$ is the index limit corresponding to $C_{low}$, and $I_{high}$ is the index limit corresponding to $C_{high}$.
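The interpolation in Equation (1) can be sketched in a few lines of Python. The breakpoint table below is a hypothetical placeholder chosen only for illustration (it is not taken from this paper or from any standard), and the function names are likewise assumptions. Taking the overall AQI as the largest sub-index is the usual convention and is assumed here.

```python
from bisect import bisect_left

# Hypothetical breakpoints for one pollutant: (concentration limit, index limit).
# Placeholder numbers for illustration only.
BREAKPOINTS = [(0.0, 0), (35.0, 50), (75.0, 100), (115.0, 150), (150.0, 200)]


def sub_index(c, breakpoints=BREAKPOINTS):
    """Piecewise-linear sub-index I for concentration c, following Equation (1)."""
    limits = [b[0] for b in breakpoints]
    if not limits[0] <= c <= limits[-1]:
        raise ValueError("concentration outside the breakpoint table")
    # Locate the bracket [C_low, C_high] that contains c.
    hi = max(1, bisect_left(limits, c))
    c_low, i_low = breakpoints[hi - 1]
    c_high, i_high = breakpoints[hi]
    return (i_high - i_low) / (c_high - c_low) * (c - c_low) + i_low


def overall_aqi(sub_indices):
    """Overall AQI taken as the largest pollutant sub-index (assumed convention)."""
    return max(sub_indices)
```

With this placeholder table, a concentration of 55 falls in the bracket [35, 75], so the sub-index is halfway between 50 and 100.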
In this paper, in addition to considering the impact of air pollutant concentrations on AQI values, the influence of meteorological factors is also taken into account. Theoretically, wind speed and direction play crucial roles in the dispersion and transport of air pollutants. Higher wind speeds can help rapidly dilute and disperse pollutants, thereby reducing the AQI [34,35]. Temperature affects atmospheric stability; moderate temperatures can promote the mixing and dispersion of pollutants, contributing to a decrease in AQI. Relative humidity influences the moisture content of particulate matter in the air and the rate of chemical reactions. Higher humidity may reduce the concentration of particulate matter and slow down chemical reactions, potentially aiding in lowering the AQI. Under high atmospheric pressure, the air tends to be more stable, which hinders the dispersion of pollutants and may lead to an increase in AQI.

2.2. Correlation Analysis of Grey Association

Grey relational analysis is a method for studying the degree of correlation between variables; it measures how closely the comparison (child) factors follow the reference (parent) factor [36]. Its advantages lie in multi-objective decision-making, support for multiple approaches, and flexible handling [37]. The analysis steps are as follows:
Step 1: Arrange the data sequences as a matrix to construct the evaluation system, as in Equation (2).
$$ (X_1, X_2, X_3, \ldots, X_n) = \begin{pmatrix} X_1(1) & X_2(1) & \cdots & X_n(1) \\ X_1(2) & X_2(2) & \cdots & X_n(2) \\ \vdots & \vdots & & \vdots \\ X_1(m) & X_2(m) & \cdots & X_n(m) \end{pmatrix} \tag{2} $$
Step 2: Select the reference sequence, which serves as the criterion against which the other sequences are compared. The reference sequence is chosen according to the purpose of the analysis, as in Equation (3).
$$ X_1 = [X_1(1), X_1(2), X_1(3), \ldots, X_1(m)] \tag{3} $$
Step 3: The sequences have different units; for instance, PM2.5 is measured in μg/m³, while CO is measured in mg/m³, so a direct analysis would not lead to accurate conclusions. The data sequences must therefore be made dimensionless, as shown in Equation (4).
$$ \bar{X}_i(k) = \frac{X_i(k)}{X_i(1)} \tag{4} $$
where k = 1, 2, 3, …, m and i = 1, 2, 3, …, n. After the dimensionless transformation, the data matrix in Equation (5) is obtained.
$$ (X_1, X_2, X_3, \ldots, X_n) = \begin{pmatrix} X_{11} & X_{12} & \cdots & X_{1n} \\ X_{21} & X_{22} & \cdots & X_{2n} \\ \vdots & \vdots & & \vdots \\ X_{n1} & X_{n2} & \cdots & X_{nn} \end{pmatrix} \tag{5} $$
Step 4: Calculate the absolute difference between the reference sequence and each comparison sequence according to Equation (6).
$$ \Delta_i(k) = |X_1(k) - X_i(k)| \tag{6} $$
In Equation (6), k = 1, 2, 3, …, m indexes the data points and i = 1, 2, 3, …, n indexes the data sequences being compared. Then, determine the maximum value M and the minimum value L of the absolute differences:
$$ M = \max_{i=1}^{n} \max_{k=1}^{m} |X_1(k) - X_i(k)| \tag{7} $$
$$ L = \min_{i=1}^{n} \min_{k=1}^{m} |X_1(k) - X_i(k)| \tag{8} $$
Step 5: Calculate the grey correlation coefficient of each influencing factor, as shown in Equation (9).
$$ \varsigma_i(k) = \frac{\min_i \min_k \Delta_i(k) + \rho \max_i \max_k \Delta_i(k)}{\Delta_i(k) + \rho \max_i \max_k \Delta_i(k)}, \quad \rho \in (0, \infty) \tag{9} $$
Here, ρ represents the resolution coefficient, which is generally taken as 0.5 in calculations. The larger its value, the smaller the difference between the correlation coefficients. Conversely, the smaller the value, the greater the difference.
Step 6: Calculate the overall correlation degree according to Equation (10).
$$ r_i = \frac{1}{n} \sum_{k=1}^{n} \varsigma_i(k), \quad k = 1, 2, 3, \ldots, n \tag{10} $$
Step 7: Sort the obtained correlation coefficients. PM2.5, PM10, NO2, CO, average temperature, average relative humidity, average air pressure, and wind speed were taken as characteristic values, and AQI was used as the reference value. The data were sorted by grey correlation degree, as shown in Table 1.
Table 1 shows that PM2.5 has the highest correlation with AQI, followed by PM10, NO2, and CO. The average air pressure has the lowest correlation with AQI, which is 0.357, consistent with the ranking of the Pearson correlation coefficient analysis method.
The grey correlation degrees between AQI and both average temperature and average wind speed are greater than 0.7, so these two factors can also be selected as inputs of the prediction model. According to the grey relational analysis, PM2.5, PM10, NO2, CO, average temperature, and average wind speed are therefore used as the inputs for the subsequent prediction of AQI.
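Steps 3 to 6 of the grey relational analysis can be illustrated with a short Python sketch. The function name and the dictionary-based interface are assumptions made for this example. The sketch averages the coefficients over however many data points are supplied, and it uses the initial-value scaling of Equation (4), so every sequence must start with a non-zero value.

```python
def grey_relational_degrees(reference, factors, rho=0.5):
    """Grey relational degree of each factor with respect to the reference.

    reference: list of values (e.g. AQI); factors: dict name -> list of the same length.
    """
    def normalise(seq):
        if seq[0] == 0:
            raise ValueError("first value must be non-zero for initial-value scaling")
        return [v / seq[0] for v in seq]           # Equation (4)

    ref = normalise(reference)
    # Absolute differences to the reference sequence, Equation (6).
    deltas = {
        name: [abs(r - x) for r, x in zip(ref, normalise(seq))]
        for name, seq in factors.items()
    }
    all_deltas = [d for row in deltas.values() for d in row]
    big_m, small_l = max(all_deltas), min(all_deltas)   # Equations (7) and (8)
    if big_m == 0:
        # Every factor coincides with the reference after scaling.
        return {name: 1.0 for name in deltas}
    degrees = {}
    for name, row in deltas.items():
        coeffs = [(small_l + rho * big_m) / (d + rho * big_m) for d in row]  # Equation (9)
        degrees[name] = sum(coeffs) / len(coeffs)                            # Equation (10)
    return degrees
```

Sorting the returned dictionary by value gives the ranking described in Step 7; a factor that is an exact multiple of the reference receives a degree of 1.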

2.3. SSA-VMD-GRU Prediction Model

2.3.1. Sparrow Search Algorithm (SSA)

The sparrow search algorithm (SSA) is a relatively new swarm intelligence optimization algorithm that can improve search efficiency and help the model avoid falling into local optima. Its optimization ability is superior to that of intelligent optimization algorithms such as particle swarm optimization and the grey wolf optimizer. Figure 1 shows the function test results of the sparrow search algorithm.
Suppose the search space has dimension d and the population size is n. The sparrow population is represented as in Equation (11).
$$ X = \begin{bmatrix} X_1^1 & X_1^2 & \cdots & X_1^d \\ X_2^1 & X_2^2 & \cdots & X_2^d \\ \vdots & \vdots & & \vdots \\ X_n^1 & X_n^2 & \cdots & X_n^d \end{bmatrix} \tag{11} $$
The fitness of all sparrows in the population is given by Equation (12), where f denotes the fitness of an individual sparrow.
$$ F_x = \begin{bmatrix} f([X_1^1 \; X_1^2 \; \cdots \; X_1^d]) \\ f([X_2^1 \; X_2^2 \; \cdots \; X_2^d]) \\ \vdots \\ f([X_n^1 \; X_n^2 \; \cdots \; X_n^d]) \end{bmatrix} \tag{12} $$

2.3.2. The Producer’s Location Update Strategy

In Equation (13), X is the position of a sparrow, t is the current iteration number, α is a random number in [0, 1], $iter_{max}$ is the maximum number of iterations, $R_2$ is the warning value, ST is the safety value, i is the index of the sparrow, j is the dimension, Q is a random number that follows a normal distribution, and L is a 1 × d matrix whose elements are all 1.
$$ X_{i,j}^{t+1} = \begin{cases} X_{i,j}^{t} \cdot \exp\left(\dfrac{-i}{\alpha \cdot iter_{max}}\right), & R_2 < ST \\ X_{i,j}^{t} + Q \cdot L, & R_2 \geq ST \end{cases} \tag{13} $$
When $R_2 < ST$, the environment is safe, and the producer enters a wide-range search mode. When $R_2 \geq ST$, some sparrows have sensed a potential threat, and all sparrows quickly fly to other safe areas to ensure the safety of the entire group.
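A minimal Python sketch of the producer update in Equation (13) follows. The function name and argument layout are assumptions for illustration; α is drawn from (0, 1] to avoid division by zero, and Q is drawn from a standard normal distribution, which is also an assumption.

```python
import math
import random


def update_producer(x, i, iter_max, st, rng=random):
    """One producer move following Equation (13).

    x: position of sparrow i (list of d floats); i: 1-based index of the sparrow;
    st: safety value ST. R2, alpha and Q are drawn inside the function.
    """
    r2 = rng.random()                      # warning value R2
    if r2 < st:
        # Safe state: shrink the position by an exponential factor.
        alpha = 1.0 - rng.random()         # alpha in (0, 1], avoids division by zero
        factor = math.exp(-i / (alpha * iter_max))
        return [xj * factor for xj in x]
    # Alarm state: shift every dimension by the same random amount.
    q = rng.gauss(0.0, 1.0)                # Q from a standard normal (assumption)
    return [xj + q for xj in x]            # L is a row of ones, so Q*L adds Q everywhere
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which is convenient when comparing optimizer settings.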

2.3.3. The Position Update Strategy for Followers

In Equation (14), $X_{best}^{t}$ and $X_{worst}^{t}$ are, respectively, the optimal position currently occupied by the producer and the global worst position; $A^{+} = A^{T}(AA^{T})^{-1}$, where A is a 1 × d matrix whose elements take values within [−1, 1].
$$ X_{i,j}^{t+1} = \begin{cases} Q \cdot \exp\left(\dfrac{X_{worst}^{t} - X_{i,j}^{t}}{i^{2}}\right), & i > \dfrac{n}{2} \\ X_{best}^{t} + \left| X_{i,j}^{t} - X_{best}^{t+1} \right| \cdot A^{+} \cdot L, & i \leq \dfrac{n}{2} \end{cases} \tag{14} $$
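The follower update in Equation (14) can be sketched as below. This is one reading of the matrix form, not a definitive implementation: A is assumed to hold entries of +1 or −1, so that $AA^{T} = d$; Q is assumed to be standard normal; and a single `best` vector stands in for the producer's best position at both t and t + 1. All names are hypothetical.

```python
import math
import random


def update_follower(x, i, n, best, worst, rng=random):
    """One follower move following Equation (14).

    x: position of sparrow i (list of d floats); i: 1-based index; n: population size;
    best / worst: best producer position and global worst position.
    """
    if i > n / 2:
        # Poorly ranked followers move elsewhere to forage.
        q = rng.gauss(0.0, 1.0)            # Q from a standard normal (assumption)
        return [q * math.exp((wj - xj) / i ** 2) for wj, xj in zip(worst, x)]
    # A is a 1 x d row of +1/-1 entries (assumption), hence A A^T = d.
    a = [rng.choice((-1.0, 1.0)) for _ in x]
    a_plus = [aj / len(a) for aj in a]     # A^+ = A^T (A A^T)^-1
    # |X - X_best| (1 x d) times A^+ (d x 1) is a scalar step.
    step = sum(abs(xj - bj) * apj for xj, bj, apj in zip(x, best, a_plus))
    return [bj + step for bj in best]      # step * L, with L a row of ones
```

A follower that already sits on the best position takes a zero step and stays there.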

2.3.4. The Position Update Strategy of the Vigilante

In Equation (15), β is the step size control parameter, λ is a random number in [0, 1], $f_i$ is the fitness of the current sparrow, $f_g$ is the globally optimal fitness value, $f_w$ is the globally worst fitness value, and φ is a very small constant that avoids division by zero.
$$ X_{i,j}^{t+1} = \begin{cases} X_{best}^{t} + \beta \left| X_{i,j}^{t} - X_{best}^{t} \right|, & f_i > f_g \\ X_{i,j}^{t} + \lambda \dfrac{\left| X_{i,j}^{t} - X_{worst}^{t} \right|}{(f_i - f_w) + \varphi}, & f_i \leq f_g \end{cases} \tag{15} $$
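The vigilante update can be sketched in the same style. The code assumes a minimisation problem, draws β from a standard normal distribution (an assumption), and divides by the gap to the worst fitness $f_w$ defined in the text; the function name and the default value of φ are hypothetical.

```python
import random


def update_vigilante(x, f_i, best, worst, f_g, f_w, phi=1e-8, rng=random):
    """One vigilante move following Equation (15), for a minimisation problem."""
    if f_i > f_g:
        # Sparrow at the edge of the group: move relative to the best position.
        beta = rng.gauss(0.0, 1.0)         # step-size control from N(0, 1) (assumption)
        return [bj + beta * abs(xj - bj) for xj, bj in zip(x, best)]
    # Sparrow already at the best fitness: step scaled by its distance to the worst position.
    lam = rng.random()                     # lambda in [0, 1)
    denom = (f_i - f_w) + phi              # phi keeps the denominator away from zero
    return [xj + lam * abs(xj - wj) / denom for xj, wj in zip(x, worst)]
```

In a full optimizer, this update would be applied to a randomly chosen subset of the population after the producer and follower updates in each iteration.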

2.4. Variational Mode Decomposition

Variational mode decomposition (VMD) decomposes the input nonlinear signal into multiple intrinsic mode function (IMF) components by formulating the decomposition as a constrained variational problem, that is, by minimizing the sum of the bandwidths of the modes, each of which has a finite bandwidth around its own centre frequency. The original AQI sequence is random and nonlinear, and VMD can decompose it into a series of stable modal components. The IMFs obtained by VMD are defined as follows.
$$ U_k(t) = A_k(t)\cos(\theta_k(t)) \tag{16} $$
$$ \omega_k(t) = \theta_k'(t) = \frac{d\theta_k(t)}{dt} \tag{17} $$
In Equations (16) and (17), $U_k(t)$ is the modal function, k = 1, 2, 3, …, K, where K is the total number of modal components, $A_k(t)$ is the instantaneous amplitude, $\omega_k(t)$ is the instantaneous frequency of $U_k(t)$, and $\theta_k(t)$ is the phase function.

2.4.1. Construct Variational Problems

This constrained problem aims to minimize the total bandwidth of the modal components while ensuring that the decomposed subsequences retain their physical meaning and avoiding overfitting. Let the original input AQI signal be f(t), where t is the time variable. The Hilbert transform is used to associate each modal component with its corresponding frequency $\omega_k(t)$. Finally, the bandwidth is estimated from the squared L2 norm of the gradient of the demodulated signal. The specific model is as follows:
$$ \min_{\{U_k(t)\},\{\omega_k(t)\}} \left\{ \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * U_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_{k=1}^{K} U_k(t) = f(t) \tag{18} $$
In Equation (18), δ(t) is the Dirac distribution, $\partial_t$ is the partial derivative with respect to time t, j is the imaginary unit, and * denotes convolution. In $\|\cdot\|_2^2$, the subscript 2 denotes the L2 norm and the superscript 2 denotes squaring.

2.4.2. Solve the Variational Problem

To ensure better accuracy, Equation (18) is converted into an unconstrained problem using the quadratic penalty factor α and the Lagrange multiplier λ(t):
$$ L(\{U_k(t)\}, \{\omega_k(t)\}, \lambda(t)) = \alpha \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * U_k(t) \right] e^{-j\omega_k t} \right\|_2^2 + \left\| f(t) - \sum_{k=1}^{K} U_k(t) \right\|_2^2 + \left\langle \lambda(t), f(t) - \sum_{k=1}^{K} U_k(t) \right\rangle \tag{19} $$
In Equation (19), $\langle \cdot, \cdot \rangle$ denotes the inner product. The optimal solution is found by iteratively updating $U_k^{n+1}$, $\omega_k^{n+1}$, and $\lambda^{n+1}$ with the alternating direction method of multipliers. The steps are as follows:
Step 1: Initialize $\{U_k^{n+1}\}$, $\{\omega_k^{n+1}\}$, and $\lambda^{n+1}$ with n = 0.
Step 2: Update the modal components $U_k$, as in Equation (20).
$$ \hat{U}_k^{n+1}(\omega) = \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{U}_i^{n}(\omega) + \dfrac{\hat{\lambda}^{n}(\omega)}{2}}{1 + 2\alpha(\omega - \omega_k^{n})^2} \tag{20} $$
In Equation (20), $\hat{U}_k(\omega)$, $\hat{f}(\omega)$, and $\hat{\lambda}(\omega)$ are, respectively, the Fourier transforms of $U_k(t)$, $f(t)$, and $\lambda(t)$.
Step 3: Update the centre frequency $\omega_k^{n+1}$, as in Equation (21).
$$ \omega_k^{n+1} = \frac{\int_0^{\infty} \omega \left| \hat{U}_k^{n+1}(\omega) \right|^2 d\omega}{\int_0^{\infty} \left| \hat{U}_k^{n+1}(\omega) \right|^2 d\omega} \tag{21} $$
Step 4: Update $\lambda^{n+1}$, as in Equation (22), where τ is the noise tolerance parameter.
$$ \hat{\lambda}^{n+1}(\omega) = \hat{\lambda}^{n}(\omega) + \tau \left[ \hat{f}(\omega) - \sum_{k} \hat{U}_k^{n+1}(\omega) \right] \tag{22} $$
Step 5: Given a convergence tolerance p > 0, repeat Steps 2 to 4 until the convergence condition of Equation (23) is met; then, stop the iteration.
$$ \sum_{k=1}^{K} \frac{\left\| \hat{U}_k^{n+1} - \hat{U}_k^{n} \right\|_2^2}{\left\| \hat{U}_k^{n} \right\|_2^2} < p \tag{23} $$
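The four update rules in Steps 2 to 5 act on sampled spectra, so they can be sketched with plain Python lists of complex numbers. The helper names below are assumptions, and the sketch covers only the individual updates: a complete VMD would also need the Fourier transform of the signal and the outer loop over modes and iterations.

```python
def update_mode(f_hat, others_hat, lam_hat, omegas, omega_k, alpha):
    """Equation (20): filter the residual spectrum around the centre frequency omega_k.

    others_hat is the summed spectrum of all other modes; omegas lists the sampled frequencies.
    """
    return [
        (f - o + l / 2) / (1 + 2 * alpha * (w - omega_k) ** 2)
        for f, o, l, w in zip(f_hat, others_hat, lam_hat, omegas)
    ]


def update_centre(u_hat, omegas):
    """Equation (21): power-weighted mean frequency over the non-negative frequencies."""
    pairs = [(w, abs(u) ** 2) for u, w in zip(u_hat, omegas) if w >= 0]
    power = sum(p for _, p in pairs)
    if power == 0:
        raise ValueError("mode has no energy at non-negative frequencies")
    return sum(w * p for w, p in pairs) / power


def update_multiplier(lam_hat, f_hat, modes_hat, tau):
    """Equation (22): dual ascent on the reconstruction residual."""
    total = [sum(col) for col in zip(*modes_hat)]
    return [l + tau * (f - s) for l, f, s in zip(lam_hat, f_hat, total)]


def converged(new_modes, old_modes, p):
    """Equation (23): summed relative change of the mode spectra is below p."""
    change = 0.0
    for new, old in zip(new_modes, old_modes):
        num = sum(abs(a - b) ** 2 for a, b in zip(new, old))
        den = sum(abs(b) ** 2 for b in old)
        if den > 0:
            change += num / den
        elif num > 0:
            return False                   # a mode grew from zero energy: not converged
    return change < p
```

A larger α narrows the filter in `update_mode`, which is why the penalty factor controls the bandwidth of each extracted mode.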

2.5. Gated Recurrent Unit Model

Gated recurrent unit (GRU) is a type of recurrent neural network that can better capture the dependency relationship between the long and short periods of time series and can solve the problem of AQI prediction under the influence of complex environments. The internal structure diagram of the GRU is shown in Figure 2.
$h_{t-1}$ and $x_t$ are, respectively, the neuron output at the previous moment and the neuron output of the previous layer (the input at time t). $z_t$ is the update gate, and $r_t$ is the reset gate. At time t, the sigmoid function is used to activate the two gates, as shown in Equation (24).
$$ \begin{cases} z_t = \mathrm{sigmoid}(W_z x_t + U_z h_{t-1}) \\ r_t = \mathrm{sigmoid}(W_r x_t + U_r h_{t-1}) \end{cases} \tag{24} $$
In Equation (24), $W_z$ and $U_z$ are the weights of the update gate, and $W_r$ and $U_r$ are the weights of the reset gate. The internal (candidate) state of the neuron at time t is
$$ \tilde{h}_t = \tanh(r_t \odot U h_{t-1} + W x_t) \tag{25} $$
In Equation (25), $\odot$ is the Hadamard product, $r_t$ is the reset gate state, and U and W are the weights of $\tilde{h}_t$. The final output is given by Equation (26).
$$ h_t = (1 - z_t) \odot \tilde{h}_t + z_t \odot h_{t-1} \tag{26} $$
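To make the gate equations concrete, the following sketch performs a single GRU time step with plain Python lists. The function names and the dictionary of weights are assumptions for illustration; biases are omitted because Equations (24) to (26) do not include them.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def gru_step(x_t, h_prev, p):
    """One GRU time step following Equations (24)-(26).

    p is a dict of weight matrices: Wz, Uz, Wr, Ur, W, U (no biases).
    """
    # Update and reset gates, Equation (24).
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], x_t), matvec(p["Uz"], h_prev))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x_t), matvec(p["Ur"], h_prev))]
    # Candidate state, Equation (25): the reset gate scales U h_{t-1} elementwise.
    uh = matvec(p["U"], h_prev)
    wx = matvec(p["W"], x_t)
    h_cand = [math.tanh(ri * u + w) for ri, u, w in zip(r, uh, wx)]
    # Output, Equation (26): the update gate blends the candidate with the previous state.
    return [(1 - zi) * hc + zi * hp for zi, hc, hp in zip(z, h_cand, h_prev)]
```

With all weights set to zero, both gates equal 0.5 and the candidate state is zero, so the step simply halves the previous hidden state; this is a convenient sanity check for the signs in Equation (26).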

3. Construction of the SSA-VMD-GRU Model

The AQI is influenced by multiple factors, and the AQI sequence is nonlinear and non-stationary. VMD has significant advantages in stabilizing such sequences, SSA performs well as a new meta-heuristic algorithm for parameter optimization, and GRU performs well in data prediction. To further enhance the accuracy of short-term AQI prediction, this paper therefore proposes an AQI prediction model based on SSA-VMD-GRU. The original AQI data sequence is decomposed through VMD to obtain the optimal decomposition subsequences, which further improves the prediction accuracy of the AQI. The specific steps are as follows:
Step 1: Select several characteristic values that affect the AQI as the input of the model.
Step 2: The AQI data sequence is decomposed using the VMD method to obtain K components.
Step 3: Set the population size of the sparrow search algorithm, the maximum number of iterations, and the search range of parameters.
Step 4: Establish the coupling model of the sparrow search algorithm (SSA) and the gated recurrent unit (GRU) (SSA-GRU).
Step 5: SSA-GRU prediction was performed on each IMF component decomposed by VMD to obtain K predicted values.
Step 6: Superimpose the K predicted values to obtain the final predicted value of the AQI.
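The decompose, predict, and superimpose structure of Steps 2 to 6 can be outlined as follows. The two callables are placeholders (assumptions) that stand in for VMD and for the SSA-tuned GRU; the sketch only shows how the K component forecasts are recombined.

```python
def forecast_by_decomposition(series, decompose, fit_predict):
    """Decompose a series, forecast every component, and sum the forecasts.

    decompose(series) -> list of K component series that add up to the input;
    fit_predict(component) -> list of forecasts for that component.
    Both callables are placeholders for VMD and the SSA-tuned GRU.
    """
    components = decompose(series)                       # Step 2
    forecasts = [fit_predict(c) for c in components]     # Steps 3-5
    horizon = len(forecasts[0])
    if any(len(f) != horizon for f in forecasts):
        raise ValueError("all component forecasts must have the same length")
    return [sum(vals) for vals in zip(*forecasts)]       # Step 6
```

Any decomposition whose components sum to the original series and any per-component predictor can be plugged in, which makes it straightforward to compare, for example, a GRU with and without the decomposition stage.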
To comprehensively evaluate the performance and prediction accuracy of the model, this paper adopts root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE) as evaluation indicators. The specific calculation formulas are as follows:
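$$ RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (\hat{y}_i - y_i)^2} \tag{27} $$
$$ MAE = \frac{1}{N} \sum_{i=1}^{N} \left| \hat{y}_i - y_i \right| \tag{28} $$
$$ MAPE = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{\hat{y}_i - y_i}{y_i} \right| \tag{29} $$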
In Equations (27)–(29), y ^ i represents the predicted value, and yi represents the true value. The SSA-VMD-GRU algorithm model flowchart is shown in Figure 3.
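The three metrics are straightforward to compute. The sketch below uses their standard definitions; the function names are assumptions, and `mape` returns a fraction that can be multiplied by 100 to report a percentage.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error, Equation (27)."""
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error, Equation (28)."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction, Equation (29)."""
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when a true value is zero")
    return sum(abs((p - t) / t) for t, p in zip(y_true, y_pred)) / len(y_true)
```

For example, true values of 100 and 200 predicted as 110 and 190 give an RMSE of 10, an MAE of 10, and a MAPE of 0.075 (7.5%).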

4. Discussions

PM2.5, PM10, NO2, CO, average temperature, and average wind speed are taken as the inputs for predicting AQI. We selected 2000 sets of valid data from traffic monitoring points in Nanning City, China, from the first to the fourth quarter of 2024–2025. The locations of data collection (10 data collection points in Nanning City, China) are shown in Figure 4.
The first 1400 sets were used as the training set and the last 600 sets as the test set. Because pollutant concentration, average temperature, and average wind speed have different units, predicting directly from the raw data may lead to large deviations in the AQI prediction. The data are therefore normalized before training and prediction, which also speeds up the convergence of gradient descent. This paper uses the MATLAB mapminmax() function for normalization, which applies the following mapping:
$$ y = \frac{(y_{max} - y_{min}) \cdot (x - x_{min})}{x_{max} - x_{min}} + y_{min} \tag{30} $$
[input, ps] = mapminmax(input_train)
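For readers without MATLAB, the same mapping can be sketched in Python. The function names are assumptions; the default target range of [−1, 1] is chosen to match the default of mapminmax, and each row is treated as one feature, as mapminmax does. The settings fitted on the training rows are reused for the test rows so that both sets share one scale.

```python
def fit_minmax(rows, y_min=-1.0, y_max=1.0):
    """Record the minimum and maximum of every row (one row per feature)."""
    return {
        "x_min": [min(r) for r in rows],
        "x_max": [max(r) for r in rows],
        "y_min": y_min,
        "y_max": y_max,
    }


def apply_minmax(rows, ps):
    """Map each row with Equation (30) using previously fitted settings."""
    span_y = ps["y_max"] - ps["y_min"]
    out = []
    for row, lo, hi in zip(rows, ps["x_min"], ps["x_max"]):
        if hi == lo:
            out.append(list(row))          # constant feature: left unchanged in this sketch
        else:
            out.append([span_y * (x - lo) / (hi - lo) + ps["y_min"] for x in row])
    return out
```

Test values that lie outside the training range are mapped outside [−1, 1], which is expected when the settings come from the training set only.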
Selecting an appropriate number of layers in a GRU recurrent network is of great significance for time series with strong volatility. Theoretically, the more layers a GRU recurrent network has, the better the learning ability of the model will be. This paper adopts the control variable method to debug and verify by only changing the number of network layers.
As shown in Table 2, the percentage error is smallest when the number of GRU layers is two, and it gradually increases as more layers are added, which is attributed to overfitting. A two-layer stacked design was therefore chosen, with 256 and 128 neurons, respectively, and the initial learning rate of the GRU was set to 0.01.
In the SSA-VMD-GRU model, the sparrow population size N is set to 20, and the maximum number of iterations M is set to 10; the model's parameter count is much lower than that of the 3D convolutional layer of CNN-GRU. The Adam optimization algorithm is adopted during training to improve the convergence of the GRU model; it combines the advantages of the root mean square propagation (RMSProp) algorithm and the adaptive gradient (AdaGrad) algorithm. During training, the weights of the neurons are continuously updated to optimize the model.
As shown in Figure 5, 4000 sets of data were selected from 8000 sets of AQI sequences and decomposed by VMD into five modal components. In the frequency domain, the high-frequency and low-frequency components obtained by VMD change regularly with relatively small variation amplitudes, and essentially no spectral aliasing occurs. This indicates that VMD effectively separates sudden high-frequency pollution signals (such as emissions during morning and evening rush hours) and avoids the information loss caused by spectral aliasing in CNN-GRU; it also suggests that the SSA optimization of the GRU parameters can prevent the underfitting of nonlinear time series seen with random forest and XGBoost.

4.1. Relevant Evaluation Metrics and AQI Prediction Results of the Test Set and Training Set for the First-Quarter AQI Prediction

The relevant evaluation metrics of the test set and training set predicted by AQI in the first quarter of 2024 are shown in Table 3.
For the test set, compared with the traditional GRU model, the RMSE of the VMD-GRU model decreased by 5.1431, the MAE decreased by 3.7794, and the mean absolute percentage error (MAPE) decreased by 7.16%. Compared with the VMD-GRU model, the RMSE of the SSA-VMD-GRU model decreased by 0.8092, the MAE decreased by 0.448, and the MAPE decreased by 1.59%.
The AQI prediction results for the first quarter are shown in Figure 6a, and the prediction errors are shown in Figure 6b. The AQI predictions of both the VMD-GRU and SSA-VMD-GRU models fit the actual data closely, whereas the GRU model deviates the most. Compared with the GRU model, the VMD-GRU model fits significantly better, and the SSA-VMD-GRU model fits best, performing well in the ascending and descending stages of the curve as well as at troughs and peaks. The GRU model performed poorly, with a maximum error of up to 30. The errors of both the VMD-GRU model and the SSA-VMD-GRU model remained around 2, with the SSA-VMD-GRU model achieving the lower error.

4.2. Relevant Evaluation Metrics and AQI Prediction Results of the Test Set and Training Set for the Second-Quarter AQI Prediction

The relevant evaluation indicators for the AQI prediction in the second quarter of 2024 are shown in Table 4.
For the test set, compared with the traditional GRU model, the RMSE of the VMD-GRU model decreased by 5.2487, the MAE decreased by 3.7681, and the mean absolute percentage error (MAPE) decreased by 7.67%. Compared with the GRU model, the RMSE of the SSA-VMD-GRU model decreased by 5.5582, the MAE decreased by 3.7681, and the MAPE decreased by 8.11%. The SSA-VMD-GRU model therefore also improves on the VMD-GRU model.
The AQI prediction results for the second quarter are shown in Figure 7a, and the prediction errors are shown in Figure 7b. The AQI predictions of the SSA-VMD-GRU model fit the actual data most closely, whereas the GRU model deviates the most. The GRU model performed poorly, with a maximum error of up to 22. The errors of both the VMD-GRU model and the SSA-VMD-GRU model remained around 2.2, with the SSA-VMD-GRU model achieving the lower error.

4.3. Relevant Evaluation Metrics and AQI Prediction Results of the Test Set and Training Set for the Third-Quarter AQI Prediction

The relevant evaluation metrics of the test set and training set predicted by the AQI in the third quarter of 2025 are shown in Table 5.
For the test set, compared with the traditional GRU model, the RMSE of the VMD-GRU model decreased by 2.9259, the MAE decreased by 1.6881, and the mean absolute percentage error (MAPE) decreased by 4.45%. Compared with the GRU model, the RMSE of the SSA-VMD-GRU model decreased by 3.8797, the MAE decreased by 2.7006, and the MAPE decreased by 4.98%. The SSA-VMD-GRU model therefore also improves on the VMD-GRU model.
The AQI prediction results for the third quarter are shown in Figure 8a. The GRU model performs the worst and deviates the most between data sets 100 and 300. Both the VMD-GRU and SSA-VMD-GRU models perform well and predict stably when the AQI sequence rises or falls, but the SSA-VMD-GRU model performs better.
The prediction error is shown in Figure 8b. The GRU model still performs poorly, with a maximum error of 30. The errors of the VMD-GRU model and the SSA-VMD-GRU model remain around 2.5, and the SSA-VMD-GRU model performs better, capturing pollution peaks more accurately.

4.4. Relevant Evaluation Metrics and AQI Prediction Results of the Test Set and Training Set for the Fourth-Quarter AQI Prediction

The relevant evaluation metrics of the test set and training set predicted by the AQI in the fourth quarter of 2025 are shown in Table 6.
For the test set, compared with the traditional GRU model, the RMSE of the VMD-GRU model decreased by 5.2956, the MAE decreased by 3.7498, and the mean absolute percentage error (MAPE) decreased by 7.69%. Compared with the VMD-GRU model, the RMSE of the SSA-VMD-GRU model decreased by 0.1547, the MAE decreased by 0.1295, and the MAPE decreased by 0.22%, which is a slight improvement.
The AQI prediction results for the fourth quarter are shown in Figure 9a, and the prediction errors are shown in Figure 9b. The GRU model still performs poorly, with a maximum error of 30. To further verify the stability, we conducted a 95% confidence interval analysis of the prediction error; the errors of the VMD-GRU model and the SSA-VMD-GRU model remain around 1.5, and the SSA-VMD-GRU model performs better. The AQI fluctuated significantly in the fourth quarter, which might be related to winter temperature inversion. The model effectively captured this nonlinear feature through VMD.
Compared with the CNN-GRU model, the training speed of SSA-VMD-GRU is more than 30% higher, which is attributed to the global optimization ability of SSA and its avoidance of local optima. In addition, the number of parameters is reduced by approximately 40%, making the model more suitable for resource-constrained transportation terminals. Based on the preceding analysis, the errors of the SSA-VMD-GRU model are both smaller and less variable than those of publicly reported models, which indicates enhanced stability, high efficiency, and good convergence in AQI prediction.

5. Conclusions

First, a traffic air quality analysis was conducted based on pollutant concentration data collected from traffic monitoring points. It covered the annual average pollutant concentration, the seasonal variation of pollutant concentration, pollutant concentration during morning and evening rush hours, and pollution characteristics on weekends and working days. Second, the degree of correlation between the monitored data and the AQI was analyzed to screen out the characteristic parameters. The GRU algorithm was then optimized using the SSA and the VMD algorithm. Finally, the quarterly data sets from the traffic air monitoring points in Nanning City for 2024–2025 were trained and tested in MATLAB R2024b.
The proposed algorithm overcomes the peak prediction deviation caused by the endpoint effect in EMD, and it avoids the tendency of VMD-BiLSTM to fall into a local optimum because of unoptimized parameters. Through adaptive frequency decomposition and global optimization, SSA-VMD-GRU achieves an MAPE fluctuation of only ±0.3% across the quarterly data, significantly enhancing its robustness. The SSA-VMD-GRU model had the best prediction effect on AQI, with the best RMSE, MAE, and MAPE among the compared models. The deep learning-based AQI prediction model studied in this paper can therefore monitor the concentration of atmospheric pollutants and accurately predict the AQI. Compared with existing methods, the model is more robust under complex meteorological conditions, and it can meet the real-time warning requirements for traffic AQI.
In future research, it is essential to further integrate artificial intelligence and big data technologies to develop more sophisticated models for analyzing and predicting traffic-related air quality. By incorporating multi-source real-time monitoring data and using methods such as deep learning, pollution trends can be simulated with higher precision and dynamic early-warning systems can be enabled. In addition, federated learning frameworks will be explored to protect privacy while expanding to multi-city validation. This will ultimately provide the public with more timely and accurate air quality information, along with health protection recommendations. It is particularly important to note that all tests were conducted in one city (Nanning), and the transferability of this model to other regions or climates has not yet been proven. We plan to include multi-city validations in subsequent studies to assess the adaptability of this model to other regions.
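The quarterly MAPE fluctuation mentioned in the conclusions can be examined from the test-set MAPE values of SSA-VMD-GRU in Tables 3–6. The paper does not say which spread statistic underlies the ±0.3% figure, so the Python sketch below simply reports the mean, the population standard deviation, and the range. The variable names are illustrative, and treating the standard deviation as the reported fluctuation is an assumption.

```python
import statistics

# Test-set MAPE (%) of SSA-VMD-GRU for quarters 1 to 4, copied from Tables 3-6.
quarterly_test_mape = [1.83, 1.60, 2.38, 1.81]

mean_mape = statistics.fmean(quarterly_test_mape)
# Population standard deviation across the four quarters (one candidate
# reading of "fluctuation"; the paper does not specify the statistic).
spread = statistics.pstdev(quarterly_test_mape)
lowest, highest = min(quarterly_test_mape), max(quarterly_test_mape)
```

Under this reading the mean test MAPE is 1.905% and the population standard deviation is roughly 0.29 percentage points, while the individual quarters range from 1.60% to 2.38%.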

Author Contributions

W.H.: Writing—original draft preparation and visualization. X.H.: Resources, methodology, and writing—review and editing. Y.Z.: Investigation. H.Z.: Supervision. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Science and Technology Project of Chongzuo under Grant Chongke 20231206 and the Guangxi Market Supervision Administration Science and Technology Program Project under Grant GXSJKJ2024-18.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All data generated or analyzed during this study are included in this published article.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AQI    Air quality index
SSA    Sparrow search algorithm
VMD    Variational mode decomposition
GRU    Gated recurrent unit

References

  1. Mogollón-Sotelo, C.; Casallas, A.; Vidal, S.; Celis, N.; Ferro, C.; Belalcazar, L. A support vector machine model to forecast ground-level PM2.5 in a highly populated city with a complex terrain. Air Qual. Atmos. Health 2020, 14, 399–409. [Google Scholar] [CrossRef]
  2. Yu, Y.; Yu, J.J.Q.; Li, V.O.K.; Lam, J.C.K. A Novel Interpolation-SVT Approach for Recovering Missing Low-Rank Air Quality Data. IEEE Access 2020, 8, 74291–74308. [Google Scholar] [CrossRef]
  3. Park, S.; Woo, S.; Lim, C.W. Predicting PM10 and PM2.5 Concentration in Container Ports: A Deep Learning Approach. Transp. Res. Part D Transp. Environ. 2023, 115, 103601. [Google Scholar] [CrossRef]
  4. Byun, D.W. Dynamically consistent formulations in meteorological and air quality models for multiscale atmospheric studies. part i: Governing equations in a generalized coordinate system. J. Atmos. Sci. 1999, 56, 3789–3807. [Google Scholar] [CrossRef]
  5. Huang, Y.; Yu, J.; Dai, X.; Huang, Z.; Li, Y. Air-Quality Prediction Based on the EMD-IPSO-LSTM Combination Model. Sustainability 2022, 14, 4889. [Google Scholar] [CrossRef]
  6. Wu, Y.; Qian, C.; Huang, H. Enhanced Air Quality Prediction Using a Coupled DVMD Informer-CNN-LSTM Model Optimized with Dung Beetle Algorithm. Entropy 2024, 26, 534. [Google Scholar] [CrossRef]
  7. Reddy, K.Y.; Krishna, K.V.S.G.M. Vehicular pollution prediction using HWTO-ANFIS model in urban areas of Hyderabad City. Environ. Prog. Sustain. Energy 2023, 42, 1. [Google Scholar] [CrossRef]
  8. Zhang, J.; Luo, Z.; Yang, Z. Research on Air Quality Prediction Based on LSTM-Transformer with Adaptive Temporal Attention Mechanism. In Proceedings of the 2023 2nd International Conference on Artificial Intelligence and Intelligent Information Processing (AIIIP), Hangzhou, China, 27–29 October 2023; IEEE: New York, NY, USA. [Google Scholar] [CrossRef]
  9. Li, H.; Wang, J.; Li, R.; Lu, H. Novel analysis–forecast system based on multi-objective optimization for air quality index. J. Clean. Prod. 2019, 208, 1365–1383. [Google Scholar] [CrossRef]
  10. Yan, R.; Liao, J.; Yang, J.; Sun, W.; Nong, M.; Li, F. Multi-hour and multi-site air quality index forecasting in Beijing using CNN, LSTM, CNN-LSTM, and spatiotemporal clustering. Expert Syst. Appl. 2020, 169, 114513. [Google Scholar] [CrossRef]
  11. Huang, W.; Wang, H.; Wei, Y. Endogenous or Exogenous? Examining Trans-Boundary Air Pollution by Using the Air Quality Index (AQI):A Case Study of 30 Provinces and Autonomous Regions in China. Sustainability 2018, 10, 4220. [Google Scholar] [CrossRef]
  12. Laudan, J.; Banzhaf, S.; Khan, B.; Nagel, K. Air Quality-Driven Traffic Management Using High-Resolution Urban Climate Modeling Coupled with a Large Traffic Simulation. Atmosphere 2025, 16, 128. [Google Scholar] [CrossRef]
  13. Wang, Z.; Chen, L.; Zhu, J.; Chen, H.; Yuan, H. Double decomposition and optimal combination ensemble learning approach for interval-valued aqi forecasting using streaming data. Environ. Sci. Pollut. Res. Int. 2020, 27, 37802–37817. [Google Scholar] [CrossRef] [PubMed]
  14. Xie, J.; Liu, F.; Liu, S.; Jiang, X. An Approach to Spatiotemporal Air Quality Prediction Integrating SwinLSTM and Kriging Methods. Sustainability 2025, 17, 2918. [Google Scholar] [CrossRef]
  15. Yang, Z.C. DCT-based Least-Squares Predictive Model for Hourly AQI Fluctuation Forecasting. J. Environ. Inform. 2020, 36, 58–69. [Google Scholar] [CrossRef]
  16. Yuan, G.; Yang, W. Evaluating China’s Air Pollution Control Policy with Extended AQI Indicator System: Example of the Beijing-Tianjin-Hebei Region. Sustainability 2019, 11, 939. [Google Scholar] [CrossRef]
  17. Zhou, W.; Sun, X.; Wan, J. Air Quality Index Prediction Based on IAO Optimized HKELM. Intell. Comput. Appl. 2023, 13, 50–56+66. [Google Scholar]
  18. Sharma, E.; Deo, R.C.; Prasad, R.; Parisi, A.V.; Raj, N. Deep Air Quality Forecasts: Suspended Particulate Matter Modeling With Convolutional Neural and Long Short-Term Memory Networks. IEEE Access 2020, 8, 209503–209516. [Google Scholar] [CrossRef]
  19. Wang, Y.; Li, H.; Dai, X.; Zhou, L. An interval AQI combination prediction model based on multiple data decomposition and information aggregation operator. Environ. Sci. Pollut. Res. 2024, 31, 14284–14302. [Google Scholar] [CrossRef]
  20. Ahmad, S.; Ahmad, T. AQI prediction using layer recurrent neural network model: A new approach. Environ. Monit. Assess. 2023, 195, 1180. [Google Scholar] [CrossRef]
  21. Sarkar, N.; Keserwani, P.K.; Govil, M.C. Prediction of AQI for Urban Metropolis Using MLSTM-GRU Model. In The Future of Artificial Intelligence and Robotics; International Conference on Deep Learning. Artificial Intelligence and Robotics; Springer: Berlin/Heidelberg, Germany, 2024; pp. 483–492. [Google Scholar]
  22. Manna, T.; Anitha, A. Hybridization of rough set-wrapper method with regularized combinational LSTM for seasonal air quality index prediction. Neural Comput. Appl. 2024, 36, 2921–2940. [Google Scholar] [CrossRef]
  23. Nguyen, A.T.; Pham, D.H.; Oo, B.L.; Ahn, Y.; Lim, B.T.H. Predicting air quality index using attention hybrid deep learning and quantum-inspired particle swarm optimization. J. Big Data 2024, 11, 71. [Google Scholar] [CrossRef]
  24. Natarajan, S.K.; Shanmurthy, P.; Arockiam, D.; Balusamy, B.; Selvarajan, S. Optimized machine learning model for air quality index prediction in major cities in India. Sci. Rep. 2024, 14, 1. [Google Scholar] [CrossRef]
  25. Hu, Y.; Ding, Y.; Jiang, W. Geographically Aware Air Quality Prediction Through CNN-LSTM-KAN Hybrid Modeling with Climatic and Topographic Differentiation. Atmosphere 2025, 16, 513. [Google Scholar] [CrossRef]
  26. Kim, K.E.; Ma, M.R.; Lee, K.W. Prediction of spatio-temporal AQI data. Commun. Stat. Appl. Methods 2023, 30, 119–133. [Google Scholar] [CrossRef]
  27. Ketu, S. Spatial Air Quality Index and Air Pollutant Concentration prediction using Linear Regression based Recursive Feature Elimination with Random Forest Regression(RFERF):a case study in India. Nat. Hazards 2022, 114, 2109–2138. [Google Scholar] [CrossRef]
  28. Shivram, U.; Murugesan, E.; Jones, G.F.; Vijayakumar, D. Breathing Smart: Advanced Models and Metrics for Accurate Air Quality Prediction and Health Impact Analysis. Int. Res. J. Adv. Eng. Hub (IRJAEH) 2025, 3, 190–199. [Google Scholar] [CrossRef]
  29. Zhang, J.; Ding, W. Prediction of Air Pollutants Concentration Based on an Extreme Learning Machine: The Case of Hong Kong. Int. J. Environ. Res. Public Health 2017, 14, 114. [Google Scholar] [CrossRef]
  30. Deng, Y.; Xu, T.; Sun, Z. A hybrid multi-scale fusion paradigm for AQI prediction based on the secondary decomposition. Environ. Sci. Pollut. Res. 2024, 31, 32694–32713. [Google Scholar] [CrossRef]
  31. Kataria, A.; Puri, V. AI- and IoT-based hybrid model for air quality prediction in a smart city with network assistance. IET Netw. 2022, 11, 221–233. [Google Scholar] [CrossRef]
  32. Mo, X.; Zhang, L.; Li, H.; Qu, Z. A Novel Air Quality Early-Warning System Based on Artificial Intelligence. Int. J. Environ. Res. Public Health 2019, 16, 3505. [Google Scholar] [CrossRef]
  33. Sun, W.; Huang, C. A hybrid air pollutant concentration prediction model combining secondary decomposition and sequence reconstruction. Environ. Pollut. 2020, 266, 115216. [Google Scholar] [CrossRef]
  34. Zheng, H.; Liu, J.; Tang, X.; Wang, Z.; Wu, H.; Yan, P.; Wang, W. Improvement of the Real-time PM2.5 Forecast over the Beijing-Tianjin-Hebei Region using an Optimal Interpolation Data Assimilation Method. Aerosol Air Qual. Res. 2018, 18, 1305–1316. [Google Scholar] [CrossRef]
  35. Zamani, M. PM2.5 Prediction Based on Random Forest, XGBoost, and Deep Learning Using Multisource Remote Sensing Data. Atmosphere 2019, 10, 373. [Google Scholar] [CrossRef]
  36. Li, Y.; Jiang, P.; She, Q.; Lin, G. Research on air pollutant concentration prediction method based on self-adaptive neuro-fuzzy weighted extreme learning machine. Environ. Pollut. 2018, 241, 1115–1127. [Google Scholar] [CrossRef]
  37. Shang, Z.; Deng, T.; He, J.; Duan, X. A novel model for hourly PM2.5 concentration prediction based on CART and EELM. Sci. Total Environ. 2018, 651, 30433052. [Google Scholar] [CrossRef]
Figure 1. The sparrow search algorithm detects the results of the function.
Figure 2. GRU network structure.
Figure 3. SSA-VMD-GRU model flowchart.
Figure 4. Location of 10 data collection points in Nanning City, China (red marked points).
Figure 5. VMD component diagram.
Figure 6. Comparison of prediction results and error comparison of GRU, VMD-GRU, and SSA-VMD-GRU algorithms in the first quarter.
Figure 7. Comparison of prediction results and error comparison of GRU, VMD-GRU, and SSA-VMD-GRU algorithms in the second quarter.
Figure 8. Comparison of prediction results and error comparison of GRU, VMD-GRU, and SSA-VMD-GRU algorithms in the third quarter.
Figure 9. Comparison of prediction results and error comparison of GRU, VMD-GRU, and SSA-VMD-GRU algorithms in the fourth quarter.
Table 1. Grey correlation method relevance ranking.
Items                        Correlation Ranking
PM2.5                        0.951
PM10                         0.941
NO2                          0.891
CO                           0.811
Average temperature          0.792
Air velocity                 0.721
Average relative humidity    0.478
Average air pressure         0.357
Table 2. Average percentage error for different GRU network layers.
GRU Layers    TN    MAPE/%
1             70    7.69
2             70    5.83
3             70    6.35
4             70    8.67
Table 3. The models predicted the evaluation effect in the first quarter.
Model Method    Training Set                Testing Set
                RMSE      MAE       MAPE    RMSE      MAE       MAPE
GRU             4.6520    2.9703    8.86%   7.6095    5.4520    10.58%
VMD-GRU         1.1004    0.2919    2.23%   2.4664    1.6726    3.42%
SSA-VMD-GRU     0.9479    0.2192    1.91%   1.6572    1.2246    1.83%
Table 4. The models predicted the evaluation effect in the second quarter.
Model Method    Training Set                Testing Set
                RMSE      MAE       MAPE    RMSE      MAE        MAPE
GRU             6.3172    4.5558    6.51%   6.7171    4.9314     9.71%
VMD-GRU         1.9013    1.4773    2.12%   1.4684    1.1633     2.04%
SSA-VMD-GRU     1.4411    1.1074    1.64%   1.1589    0.90237    1.60%
Table 5. The models predicted the evaluation effect in the third quarter.
Model Method    Training Set                Testing Set
                RMSE      MAE       MAPE    RMSE      MAE       MAPE
GRU             4.9193    3.6706    5.96%   6.3594    4.1386    7.36%
VMD-GRU         1.2829    1.0101    1.59%   3.1635    2.4505    2.91%
SSA-VMD-GRU     1.0244    0.7978    1.30%   2.4797    1.8438    2.38%
Table 6. The models predicted the evaluation effect in the fourth quarter.
Model Method    Training Set                Testing Set
                RMSE      MAE       MAPE    RMSE      MAE       MAPE
GRU             6.5206    4.6010    6.82%   6.7597    4.9046    9.72%
VMD-GRU         1.9135    1.4849    2.81%   1.4641    1.1548    2.03%
SSA-VMD-GRU     1.5876    1.2262    1.79%   1.3094    1.0253    1.81%
