Article

The Role of GARCH Effect on the Prediction of Air Pollution

1 Department of Industrial Education and Technology, National Changhua University of Education, Changhua City 50007, Taiwan
2 Department of Business Administration, National Changhua University of Education, Changhua City 50007, Taiwan
3 Department of Banking and Finance, National Chiayi University, Chiayi City 60054, Taiwan
* Authors to whom correspondence should be addressed.
Sustainability 2022, 14(8), 4459; https://doi.org/10.3390/su14084459
Submission received: 28 February 2022 / Revised: 17 March 2022 / Accepted: 25 March 2022 / Published: 8 April 2022

Abstract

Air pollution prediction is an important issue for regulators and practitioners in a sustainable era. Air pollution, especially PM2.5 resulting from industrialization, has fostered a wave of global weather migration and jeopardized human health in the past three decades. Taiwan has evolved as a highly developed economy and has a severe PM2.5 pollution problem. Thus, the control of PM2.5 is a critical issue for regulators, practitioners and academics. More recently, GA-SVM, an artificial-intelligence-based approach, has become a preferred prediction model, attributed to the advances in computer technology. However, hourly observation of PM2.5 concentration tends to present the GARCH effect. The objective of this study is to explore whether the integration of GA-SVM with the GARCH model can build a more accurate air pollution prediction model. The study adopts central Taiwan, the region with the worst level of PM2.5, as the source of observations. The empirical implementation of this study took a two-step approach; first, we examined the potential existence of the GARCH effect on the observed PM2.5 data. Second, we built a GA-SVM model integrated with the GARCH framework to predict the 8 h PM2.5 concentration of the sample region. The empirical results indicate that the prediction performance of our proposed alternative model outperformed the traditional SVM and GA-SVM models in terms of both MAPE and RMSE. The findings in this study provide evidence to support our expectation that adopting the SVM-based approach model for PM2.5 prediction is appropriate, and that prediction performance can be improved by integrating the GARCH model. Moreover, consistent with our prior expectation, the evidence further supports that taking the GARCH effect into account in the GA-SVM model significantly improves the accuracy of prediction. To the knowledge of the authors, this study is the first to attempt to integrate the GARCH effect into the GA-SVM model in the prediction of PM2.5. 
In summary, with regard to the development of sustainability for both regulators and practitioners, our results strongly encourage them to take the GARCH effect into consideration in air pollution prediction if a regression-based model is to be adopted. Furthermore, this study may shed light on the application of the GARCH model and SVM models in the air pollution prediction literature.

1. Introduction

Air pollution has long been an important concern for all countries. In particular, the level of air pollution is often closely related to the industrialization of each country. Although industrialization brings economic growth and prosperity to society, it is often accompanied by factors causing the deterioration of the ecological environment, such as waste and air pollution. For a long time, developed countries, in response to citizens’ pursuit of a cleaner environment and better life quality, have had to deal with the air pollution problems associated with high levels of industrialization. In recent years, as a result of climate change and the rapid economic development of emerging economies, air pollution has become an important issue of global concern. Accordingly, highly industrialized countries held the United Nations Framework Convention on Climate Change (UNFCCC) in 1997 and proposed the Kyoto Protocol, advocating the 55 participating member countries’ commitment to save energy and reduce carbon emissions. The goal was to stabilize the concentration of greenhouse gases in the atmosphere, while balancing food production and economic development. In addition, the private sector also raised funds to set up the non-governmental organization the Green Peace Organization (GPO), which is dedicated to research on environmental protection and the promotion of air pollution prevention measures.
In Taiwan, economic development and air pollution show a considerable degree of correlation. In the 1950s, Taiwan implemented the first phase of economic construction plans to cultivate the development of labor-intensive light industrialization. The economic growth in this decade, with an 8.7% annual growth rate, did not cause serious air pollution problems. In the 1960s, Taiwan set up a manufacturing export zone and actively developed the export industry. The average annual economic growth rate during this period was 9.8%. In the 1970s, the world experienced an oil energy crisis. The Taiwan government adopted a domestic demand-oriented strategy and promoted the Ten Major Construction Projects, which increased public expenditure, improved infrastructure and actively developed the petrochemical industry. Although Taiwan is one of the four Asian Tiger countries, the development of the petrochemical industry has brought economic growth, with the consequence of Taiwan’s current air pollution problem. In the 1980s, Taiwan actively developed technology-intensive industries and established its first science industrial park in Hsinchu, with an average annual economic growth rate of 8.5% throughout this decade. The development of the technology industry, especially the semiconductor supply chain, which contributed greatly to Taiwan’s economic growth, involved immense power and energy consumption. In order to fully meet the power demand, the government increased the production capacity of coal-fired power generation for sufficient power supply. However, the seriousness of air pollution was also aggravated [1].
In response to people’s demands for better life quality, the Taiwan government has worked on the prevention and control of air pollution, especially fine particles (PM2.5) [2]. However, in the past, when the government formulated air pollution control measures, it was often necessary to take into account the costs incurred by the industry and the need to promote domestic economic development. As a result, the effect of the early prevention and control of air pollution was not significant, especially for PM2.5. The harm caused by PM2.5 to human health has attracted international attention, because the spread of PM2.5 results not only in a deterioration of air quality but also in fatal health problems [3]. In recent years, the GPO has invested a lot of resources toward empirical research on the harm caused by air pollution to human health. According to a National Policy Research Report from the National Policy Research Foundation [4], based on an air pollution research by Harvard University on Asian countries including Taiwan, South Korea, Japan, China and the Philippines, the GPO found that coal-fired power plants not only directly emit PM2.5, but also emit sulfur dioxide (SO2), nitrogen oxides (NOx), soot and dust, etc., thus stimulating the formation and increase in PM2.5. These fine particles enter the lungs of the human body through the respiratory system, spreading through blood circulation and causing diseases such as asthma, respiratory diseases, cancer or even death. As a result, the government’s energy choices are not only closely related to carbon emissions, but also directly related to air pollution, which has a considerable negative impact on human health.
In fact, in recent years, there have been numerous cases of harm to human health caused by air pollution [3,5]. In 2010, PM2.5 from air pollution caused more than 3 million cases of premature death in the world. According to the 2013 annual report on air quality monitoring by the Environmental Protection Agency, the annual average concentration of PM2.5 in Taiwan was 24 μg/m3, which was 2.4 times higher than the World Health Organization (WHO) standard value [1]. With that PM2.5 concentration, the risk of lung cancer and childhood asthma increased by 15% and the risk of stroke, heart disease and chronic respiratory disease increased by nearly 25%. The seriousness of Taiwan’s air pollution problem did not come to the surface until the Changhua Guoguang Petrochemical Development Project Evaluation in 2010 [6]. With government’s promotion of a non-nuclear policy, the supply of green energy is still insufficient, thus the demand for coal-fired power has continued to increase in recent years. The government is still planning to build new coal-fired thermal power plants even though thermal power generation has accounted for more than 80% of annual power generation structure in Taiwan. Therefore, in this study we show that, in order to maintain citizens’ health, accurately predicting the level of PM2.5 and analyzing the sources of air pollution are extremely important issues for the government to formulate air pollution policies and to control target levels.
In terms of forecasting methodology, the Generalized Autoregressive Conditional Heteroscedasticity (GARCH) model is the most widely used method for dealing with the autoregressive characteristics of observed time series data. With recent progress in computer programming and the development of big data analysis, machine-learning-based approaches are now preferred in industry and academia in addition to traditional statistical forecasting methods. Among them, neural network systems and support vector machine (SVM) models are the most commonly used. Recently, applications have evolved towards integrated models. For example, to improve prediction accuracy, neural networks are combined with wavelet analysis [7], while SVM is preferably combined with a genetic algorithm (GA-SVM) [8].
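For reference, the GARCH(1,1) specification is usually written as below. This is the standard textbook form rather than a formula taken from this article; the symbols ($y_t$ for the observed series, $\mu$, $\omega$, $\alpha$, $\beta$, $z_t$) are notation assumed here for illustration.

```latex
\begin{aligned}
y_t &= \mu + \varepsilon_t, \qquad \varepsilon_t = \sigma_t z_t, \qquad z_t \sim \text{i.i.d.}(0,1), \\
\sigma_t^2 &= \omega + \alpha\,\varepsilon_{t-1}^2 + \beta\,\sigma_{t-1}^2, \qquad \omega > 0,\ \alpha \ge 0,\ \beta \ge 0 .
\end{aligned}
```

When $\alpha + \beta < 1$, the process is covariance stationary with unconditional variance $\omega / (1 - \alpha - \beta)$. A large $\alpha + \beta$ means that a burst of high variability tends to persist into the following periods.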
Regarding the improvement of prediction accuracy, in addition to considering the integration of methods, the attributes of observation data are also very important. The sample data examined in this study is the hourly air pollution index from the EPA monitoring stations, which may be affected by factors such as terrain, atmosphere and so on. The air pollution in the previous hourly period may not have dissipated yet, thus the deferred effects on air pollution index observed in the next period may lead to autoregression. Conventional prediction models have been built on the basis of regression analysis. In the classical regression model, residual autoregression and heteroscedasticity should not exist; therefore, their presence may produce bias and violate the underlying assumptions [9,10]. In addition, GA-SVM is a machine-learning-based AI approach. Accuracy of prediction depends not only on quality of data, but also on availability of input (correlated) variables. Thus, a sufficient number of input variables (for more training) will be helpful to generate better prediction results. If we can provide more related variables in the training process, we will be able to produce better prediction accuracy. Our intention is to examine variables, from the GARCH model, which can improve the accuracy of the prediction model for air pollution.
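One candidate for a GARCH-derived input variable is the conditional variance series itself. The sketch below only illustrates that idea and should not be read as the authors' procedure: the function name `garch11_variance`, the residual values and the parameter values are all hypothetical, and in practice omega, alpha and beta would be estimated by maximum likelihood with a statistics package.

```python
def garch11_variance(residuals, omega, alpha, beta):
    """Return the GARCH(1,1) conditional variance for each time step.

    sigma2[t] = omega + alpha * residuals[t-1]**2 + beta * sigma2[t-1]
    """
    if omega <= 0 or alpha < 0 or beta < 0:
        raise ValueError("need omega > 0, alpha >= 0, beta >= 0")
    if not residuals:
        return []
    n = len(residuals)
    mean = sum(residuals) / n
    # Start the recursion from the sample variance (an assumed, common choice).
    sigma2 = [sum((e - mean) ** 2 for e in residuals) / n]
    for t in range(1, n):
        sigma2.append(omega + alpha * residuals[t - 1] ** 2 + beta * sigma2[t - 1])
    return sigma2


# Hypothetical residuals (e.g. PM2.5 minus a fitted mean) and made-up parameters.
residuals = [1.0, -2.0, 0.5, 3.0]
variance_feature = garch11_variance(residuals, omega=0.1, alpha=0.2, beta=0.7)
```

Each value of `variance_feature` could then be appended to the row of pollutant and wind inputs for the same hour, which gives the learner one more correlated variable to train on.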
Accordingly, using key observation data from local monitoring stations, this study analyzes whether hourly PM2.5 presents an autoregressive conditional heteroscedasticity (ARCH) effect, whether incorporating the GARCH effect into the GA-SVM model can improve the performance of an air pollution prediction model, and which factors affect PM2.5. In the first stage of analysis, the examination starts with an ADF test for stationarity of our time series data, followed by an LM test and an ARCH test to investigate the existence of autoregression and the ARCH effect in our dataset. We then estimate a GARCH(1,1) model to confirm the existence of the GARCH effect. In the second stage of analysis, we integrate the GARCH effect into the GA-SVM model with PM2.5 as the prediction variable [11]. Empirical results indicate that the prediction performance of our proposed alternative model outperformed traditional SVM and GA-SVM models in terms of both MAPE and RMSE as accuracy measures.
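The ARCH test in the first stage can be illustrated with Engle's LM statistic. The following is a minimal sketch restricted to a single lag and written in plain Python; the function name `arch_lm_stat` and the toy residuals are assumptions made for illustration, and a real analysis would normally use a statistics package and several lags.

```python
def arch_lm_stat(residuals):
    """Engle's ARCH LM statistic with one lag.

    Regress e_t^2 on a constant and e_{t-1}^2; the statistic is T * R^2,
    which is asymptotically chi-square with 1 degree of freedom under the
    null hypothesis of no ARCH effect.
    """
    sq = [e * e for e in residuals]
    x, y = sq[:-1], sq[1:]
    t = len(y)
    if t < 3:
        raise ValueError("need at least 4 residuals")
    mx, my = sum(x) / t, sum(y) / t
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0  # squared residuals do not vary, so there is no sign of ARCH
    r2 = sxy * sxy / (sxx * syy)  # R^2 of a simple regression with intercept
    return t * r2


CHI2_1DF_5PCT = 3.841  # 5% critical value, chi-square with 1 degree of freedom

# Toy residuals: a calm stretch followed by a volatile stretch.
toy = [0.1, -0.1] * 5 + [5.0, -5.0] * 5
has_arch_effect = arch_lm_stat(toy) > CHI2_1DF_5PCT
```

A statistic above the critical value rejects the null of no ARCH effect, which is the condition under which fitting a GARCH(1,1) model becomes worthwhile.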
Consistent with the previous SVM literature, which suggests a trend towards integrating various approaches into the SVM model [8], our empirical results provide evidence to support our expectation that adopting an SVM-based model for PM2.5 prediction is appropriate and that prediction performance can be improved by integrating models, such as by incorporating the GARCH effect into a GA-SVM-based approach. Moreover, consistent with our prior expectation, the evidence further supports that taking the GARCH effect into account in a GA-SVM model clearly improves the accuracy of prediction. To the knowledge of the authors, this study is the first to integrate the GARCH effect into the GA-SVM model in the prediction of PM2.5. In summary, with regard to the development of sustainability for both regulators and practitioners, our results strongly encourage them to take the GARCH effect into consideration in air pollution prediction if a regression-based model is to be adopted. Furthermore, this study may shed light on the application of the GARCH model, as well as machine learning methods, in the air pollution prediction literature.

2. Materials and Methods

2.1. Literature Review

After the announcement of the Kyoto Protocol, the prediction and management of air pollution became an important common concern among industry, government and academics around the world. Scholars have put a lot of effort into empirical research and have obtained quite significant results. Nevertheless, the pursuit of the best predictive model has not reached a consistent conclusion. Here, we review key studies from the literature relevant to this study.
Zickus et al. employed daily average PM10 concentrations monitored in the Terro region of Helsinki, Finland from 1996 to 1999 as a sample, together with variables such as wind speed, wind direction, air pressure, humidity, precipitation, temperature, dew point temperature, and terrain, to predict daily PM10 concentration in 1999 with a three-year training period from 1996 to 1998 [12]. Empirical methods included logistic regression, decision tree, multivariate adaptive regression splines (MARS) and artificial neural networks (ANN). Results show that logistic regression, MARS, and neural networks perform more consistently, while decision trees perform significantly worse. Dudot et al. employed a neural network combined with a neural classifier to predict hourly maximum ozone concentrations in central France [13]. The neural model is based on the MLP structure, and the sample data was collected from the French air quality agency LIG’AIR, which has 15 ground monitoring stations; that study focused on only three stations in Orleans. The daily maximum data for hourly ozone mean concentrations from 1999 to 2003 were adopted. Since the ozone observation peaks in summer, the authors only used data from April to September. Nonetheless, the model developed can be used to make valid forecasts throughout the year. Results showed that the use of neural networks for ozone peaks produced better predictions, with a 92% concordance index, MAE = RMSE = 15 μg/m3, MBE = 5 μg/m3, as compared to the European threshold for hourly ozone of 180 μg/m3. In order to improve the accuracy of prediction, the authors used a neural classifier with a sigmoid function in the output layer. The output range of the network was [0,1], which can be interpreted as the probability of exceeding the standard. Comparing this model with logistic regression shows that the prediction accuracy index using the neural classifier is 78%, compared to 65% to 72% for the classical MLP.
Voukantsis et al. collected air quality monitoring data from 2001 to 2003 at the Kallio and Vallila stations in Helsinki, Finland and at the Sindos and Agias Sofias stations in Thessaloniki, Greece to predict the daily average PM10 concentration in 2003 through a neural network model [14]. Research results show that the predicted R2 values of PM10 at the Kallio, Vallila, Sindos and Agias Sofias stations were 0.587, 0.639, 0.472 and 0.427, respectively. RMSE was 5.884, 7.128, 16.387 and 23.577 μg/m3, respectively. Higher values of R2 did not correlate with lower RMSE. In addition, the concordance index between measured and modeled daily PM10 concentrations was between 0.8 and 0.85, while the predicted PM10 kappa index for both cities reached 60%. Empirical findings indicate that despite significant differences in the environment of the two sample cities, the performance of the PM10 prediction model was not significantly different.
Russo et al. applied a neural network to predict air quality, with hourly NO2 concentration data collected from the Chelas and Avenida da Liberdade stations in Lisbon, Portugal from 1 January 2002 to 31 December 2006 [15]. The prediction is divided into two parts. In the first part, only the period from 2002 to 2005 was considered and the data of the first three years was used as training data to predict the year of 2005. Next, the data from 2002, 2003 and 2005 were adopted for training to predict 2004, and so on for each year. In the second part, the authors used the parameter values obtained in the first part to predict the NO2 concentration value in 2006. Selected variables include NO2, NO, CO as well as temperature, wind speed, wind direction, humidity, air pressure, solar radiation, and atmospheric boundary height. Empirical results showed that the RMSE is about 20 or above. Tamas et al. used ANN to predict the next 24 h ozone concentration in Corsica, France [16]. Sample data were collected from four stations, Canetto, Giraud, Montesoro and Sposata, two in cities and two in suburbs, from 2008 to 2012 to establish an MLP-based ANN early warning model. The input variables include the concentrations of O3, NO2, MET and TI, and these variables were configured into five combined models for prediction, namely O3, O3+NO2, O3+NO2+MET, O3+NO2+TI and O3+NO2+MET+TI. Empirical results showed that for the Canetto station, the prediction of the MLP (O3+NO2+TI) model was more accurate, the RMSE was 17.31 μg/m3, and the MAE was 13.66 μg/m3; for the Sposata station, the MLP (O3+NO2+MET+TI) model was more accurate, the RMSE was 16.90 μg/m3, and the MAE was 13.31 μg/m3; for the Giraud station, the prediction of the MLP (O3+NO2+MET+TI) model was more accurate, the RMSE was 16.53 μg/m3 and the MAE was 12.82 μg/m3; for the Montesoro station, the MLP (O3+NO2+MET+TI) model was more accurate in prediction, the RMSE was 15.90 μg/m3 and the MAE was 12.30 μg/m3.
Zhou et al. proposed an EEMD-GRNN (ensemble empirical mode decomposition and general regression neural network) model for predicting PM2.5 concentrations [17]. EEMD decomposes the raw data of PM2.5 into several intrinsic mode functions (IMFs), while GRNN is used to predict each IMF. This study adopted input variables obtained from a principal component regression (PCR) model to train a hybrid EEMD-GRNN model to remove redundancy. Training data was from 1 January to 1 November in 2013, and the model was tested with data from 2 November to 21 November 2013 in Xi’an, China. The resulting MAE values of the multiple linear regression (MLR) model, PCR model, traditional autoregressive integrated moving average (ARIMA) model, GRNN model and EEMD-GRNN model were 29.30, 26.35, 23.55, 23.50 and 19.80, respectively. MAPE values were 37.00, 32.75, 29.67, 31.25 and 28.01, respectively, and MSE values were 37.42, 31.30, 28.98, 32.31 and 29.41, respectively. The EEMD-GRNN model outperformed the MLR model, PCR model, ARIMA model and GRNN model without EEMD and can be adopted to develop an early warning system for air quality. Elangasinghe et al. used neural networks combined with k-means clustering to capture the complex time series of PM10 and PM2.5 concentrations in coastal New Zealand [18]. Wind speed, wind direction, solar radiation, temperature and relative humidity were used as variables and the adjacent pollution sources were used as references. Results support improved prediction accuracy based on the values of the correlation coefficient and RMSE. The correlation coefficient between observed and predicted PM2.5 concentrations increased from 0.77 to 0.79, and that for PM10 concentrations increased from 0.63 to 0.69. The RMSE index values of PM2.5 and PM10 were reduced from 5.00 to 4.74 and from 6.77 to 6.34, respectively. Kristiani et al. implemented short-term prediction of PM2.5 in Taiwan using the long short-term memory (LSTM) deep learning method [19].
Results indicate that LSTM had the lowest RMSE value at 1.9, as compared to other models such as CNN at 3.5, Bi-LSTM at 2.5, Bi-GRU at 2.7 and RNN at 2.4.
Zheng et al. applied neural networks and linear regression as spatial and temporal prediction models and combined with regression trees [20]. The study included monitoring data from 43 cities in China from 1 May 2014 to 30 April 2015 and combined temporal predictors, spatial predictors, prediction aggregators and deformation predictors to forecast air quality at Beijing, Shanghai, Tianjin, and Guangzhou for the following 48 h. Results indicate that the model can achieve an accuracy of 0.75 in the first 6 h and 0.6 in the next 7 to 12 h. Even though forecast accuracy in Beijing was the worst among the four cities, it was still better than the weather forecasting model (WFM) adopted by the Beijing Environmental Protection Monitoring Center. Feng et al. proposed a hybrid model that combines air quality trajectory analysis and wavelet transformation with a neural network (ANN) to predict PM2.5 concentrations beyond two days, and its accuracy was observed [7]. The sample data was collected from 13 air quality monitoring stations in Tianjin and Hebei, China from 1 September 2013 to 31 October 2014. Wind speed and wind direction were set as parameters affecting air quality. The prediction results show that this hybrid model can effectively improve prediction accuracy of PM2.5, and its RMSE value can be reduced by as much as 40% on average. In particular, the days with high PM2.5 concentration can almost be predicted by wavelet decomposition, and the detection rate (DR) can reach 90% on average for the alert threshold set by the hybrid model.
Wang et al. established an urban air quality prediction system based on the weather research forecast and chemistry (WRF-Chem) model and a regional haze weather forecast system based on the Regional Atmospheric Environment Modeling System (RegAEMS) and applied to Shanghai and Nanjing in the Yangtze River Delta region of China [21]. The study conducted a one-year forecast in Shanghai from May 2009 to April 2010 and a one-month test in Nanjing in October 2007. Results show that WRF-Chem performs well in the prediction of SO2, NO2, and PM10, with the prediction accuracy of API index in Shanghai and Nanjing of 50–83% and 80%, respectively. RegAEMS performed well in haze weather forecasting in terms of RH, PM2.5 and visibility. The accuracy rates of Shanghai and Nanjing were 77% and 58%, respectively. The authors developed new classification criteria by taking relative humidity, PM2.5 and visibility as key parameters. Saide et al., applied WRF-Chem model combined with a two-kilometer grid to build a forecasting system to predict PM2.5 concentration for the next one to three days [22]. The test period was from April to August in 2014 and the sample included hourly PM2.5 observations at nine cities in Chile and the United States: Santiago, Rancagua, Curico, Talca, Chillan, Los Angeles, Temuco, Valdivia and Osorno. Empirical results show that the prediction accuracy ranged from 50 to 70%, while the optimal initialization was 61 to 76%.
Delavar et al. established an air pollution prediction model to predict PM10 and PM2.5 concentrations in Tehran [8]. The day of the week, month, topography, meteorology and pollution rates of two neighboring areas were adopted as input parameters for the machine learning methods adopted including SVR (support vector regression), NARX (nonlinear autoregressive exogenous), ANN and GWR (geographically weighted regression). Cross validation was applied on results to evaluate the best method for modeling air pollution predictions. Empirical results show that, SVR, NARX, ANN and GWR can reduce the RMSE of PM10 by 53%, 47%, 47% and 94%; and predict the RMSE of PM2.5 by 58%, 57%, 61% and 94%, respectively. The best prediction method was NARX with external input. Using the proposed prediction model, its RMSE value reached 1.79. In addition, using a genetic algorithm (GA), the authors found that variables such as day of the week, month, topography, wind direction, maximum temperature and pollution rates in two neighboring areas were the most effective parameters for predicting air pollution. Hu et al. used the hourly CO concentration values of four stations, namely, Liverpool, Chullora, Rozelle and Prospect, in Sydney, Australia from May 2009 to May 2016 as samples [23]. Using SVR as the method, CO concentration values were predicted and compared with the prediction results of ANN. Empirical results show that when MAE is used as evaluation index, prediction accuracy of SVR and ANN are 0.314 and 0.435, respectively; when RMSE is used, prediction accuracy of SVR and ANN are 0.414 and 0.677, respectively. In summary, the prediction results of SVR are more accurate than those of ANN.
Davis and Speckman adopted the generalized additive model (GAM) method to establish an air quality prediction system to estimate the next-day maximum and the average ozone concentration over an 8 h period (10 a.m. to 5 p.m.) in Houston [24]. The study collected ozone data from 10 stations in the Houston area from 1983 to 1991, as well as meteorological data at international airports. Data from April to October from 1983 to 1987 and from 1989 to 1990 were used as the training period, and 1988 and 1991 as the forecasting period. Empirical results indicate that wind direction, opaque cloud cover factor, the previous day’s maximum ozone concentration, current day’s maximum temperature and morning mixing depth were all very important variables in the model. In addition, the 8 h prediction results of the average ozone concentration at each station showed that the RMSE ranged from 13.2 to 16.3 ppb (R2 is ranged from 0.66 to 0.73); the prediction results of the maximum average ozone concentration indicated an error range from 18.5 to 22.0 ppb (R2 is ranged from 0.61 to 0.68). Siwek et al. took data collected in southern Warsaw from 2005 to 2007 as a sample to predict the PM10 concentration [25]. Three machine learning networks were adopted: Multilayer Perceptron (MLP), Radial Basis Function (RBF), and SVM. Other models proposed include wavelet-transformed MLP, RBF, and SVM models and a model integrating Blind Source Separation (BSS) and another neural network structure. Empirical results showed that the MAE values of MLP, RBF and SVM models were 6.47, 6.99 and 7.07 μg/m3, respectively, and the MAPE values were 26.43, 28.49 and 27.05%, respectively. After wavelet transformation, the MAE values of the MLP, RBF and SVM models were 4.37, 5.76 and 4.93 μg/m3, respectively, and the MAPE values were 18.04, 23.43 and 20.93%, respectively. 
The MAE values of the models integrated by BSS and SVM were 3.89 and 4.03 μg/m3, respectively, and the MAPE values were 15.78 and 15.96%, respectively. Results indicate that accuracy of the prediction was improved, and the prediction performance of the model integrated by SVM was the best, which was over 12% higher than the SVM model transformed by wavelet, and higher than the pure RBF model, which was the worst at over 44%.
Sotomayor-Olmedo et al. took monthly air quality monitoring data, including O3, NO2 and PM10, from Mexico City as a sample and applied SVM to predict the air pollution quality of each month in 2009 [26]. Parameters were adjusted through three kernel functions and the performance of prediction results were compared. Empirical results showed that in the prediction of O3 and NO2, the SVM model applying the Gaussian kernel function had higher accuracy. The empirical results also indicated that the prediction accuracy of the three kernel functions was lower in the last couple months of the year, especially in December. As for the prediction of PM10, the Gaussian kernel function mode performed better with a large number of SVMs and the polynomial and spline kernel function modes were relatively accurate with a small number of SVMs.
Song et al. explored a more accurate model for the prediction of power usage load spikes. The study proposed an FKM-ASVM-GARCH ECM model, which integrates GARCH and SVM models, as an alternative model to be compared with the traditional FKM-ASVM model, which does not include GARCH-modified errors [27]. The study adopted China’s electricity supply as an observation sample, using the daily load capacity from June to July 2014 as the training period of SVM and the daily load capacity in August as the test period. Results indicate that MAPE, the evaluation criterion for prediction accuracy, was reduced from 1.72 in the traditional method without GARCH to 0.74 in the alternative GARCH model. Therefore, it was suggested that the FKM-ASVM-GARCH ECM is superior to the FKM-ASVM, and that integration of the GARCH model can indeed improve the accuracy of the SVM prediction model. Ishak et al. applied SVR and random forest (RF) models to establish a prediction model for the daily maximum ozone concentration at three monitoring stations in Tunisia, namely, Gabes, Ghazela and Manouba [28]. Using the station data of the National Environmental Protection Agency (ANPE) from 20 June 2014 to 30 September 2014 as the observation sample, 36 explanatory variables, including daily maximum ozone concentration (maxO3) and other pollutant concentrations (SO2, NO2, NO and PM10), were adopted to explain daily maximum ozone concentration. The experimental results showed that the prediction performance of the RF model was better than that of the support vector regression model. The RMSE values of the RF model at the Gabes, Ghazela and Manouba stations were 2.26, 4.16, and 6.71, respectively; the MAE values were 1.85, 3.18 and 5.29 and the MAPE values were 4.08, 3.51 and 8.63, respectively.
Lin et al. adopted three machine learning methods, including decision tree regression (DTR), gradient boosted tree regression (GBTR) and SVR, to predict PM2.5 concentration in the next hour at 67 locations in Taiwan through a big data platform, with RMSE and MAE as the accuracy evaluation criteria [29]. Results showed that the RMSE values of the DTR, GBTR and SVR methods were 8.52, 5.17 and 4.68, respectively, and the MAE values were 6.25, 3.63 and 3.46, respectively. These results suggest that SVR is the better prediction model.
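The accuracy measures quoted throughout this review (RMSE, MAE and MAPE) follow standard definitions. As a reminder of what the reported numbers mean, here is a minimal sketch in plain Python; the function names and sample values are illustrative assumptions, not taken from any of the cited studies.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: penalises large errors more heavily."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error, in the same unit as the data (e.g. ug/m3)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error in percent; undefined if any actual value is 0."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Made-up observed and predicted concentrations, for illustration only.
observed = [10.0, 20.0, 40.0]
forecast = [12.0, 18.0, 40.0]
scores = (rmse(observed, forecast), mae(observed, forecast), mape(observed, forecast))
```

RMSE and MAE depend on the unit and scale of the pollutant, so values from different studies are only comparable when they refer to the same pollutant and unit; MAPE is scale-free but unstable when observed values are close to zero.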
Altogether, with the recent evolution of quantitative methods, the methodology of air pollution prediction in the literature has evolved from traditional logistic regression analysis to machine-learning-based approaches. The most widely used methods include artificial neural networks (ANN) and SVM. ANN methodology has evolved from a traditional single-layer input and output structure to the multi-layer recurrent neural network (RNN). However, the RNN method needs a huge volume of (big) data to improve accuracy, which has become a challenge for empirical studies with constrained data collection.
Another development trend is the use of hybrid prediction models combined with other methods, such as wavelet analysis, in pursuit of higher prediction accuracy. SVM methodology is moving towards combination with other algorithms, for example with a genetic algorithm (GA) in the GA-SVM model. There is still a lack of consensus about which methodology can provide the best prediction accuracy. Finally, the choice of air pollution predictors is also inconclusive in the literature. The intention of this study is to examine predictors and alternative prediction models that integrate the GARCH effect into the GA-SVM model, to improve prediction accuracy for air pollution.

2.2. Dataset

Data for our empirical study were retrieved from the Environmental Protection Agency (EPA) database in Taiwan. The study adopts the central Taiwan region, which has the worst PM2.5 levels, as the observed sample. The choice of predictive variables follows the previous literature reviewed in Section 2.1. However, due to the availability of data, fine particles (PM2.5) and nine further variables were adopted for our examination: carbon monoxide (CO), nitric oxide (NO), nitrogen dioxide (NO2), nitrogen oxide (NOx), ozone (O3), suspended particulate matter (PM10), sulfur dioxide (SO2), wind direction (WindDirection), and wind speed (WindSpeed).
The sample for this study consists of hourly observations from five air pollution monitoring stations in central Taiwan: FongYen, SaLu, DaLi, ChungMin and SeaTun. The observation period was from 20 October 2020 to 16 December 2020. The dataset was split into two samples, training data and testing (holdout) data, to evaluate the prediction performance. We held the testing data as the out-of-sample test and used the holdout test to predict the hourly PM2.5 concentration in the next 8 h in January 2021. In Table 1, we present the basic features of the data, including the locations of the five monitoring stations, the duration and frequency of the data and the number of observations at each station. There are three major reasons for choosing the research period from 20 October 2020 to 16 December 2020. First, the four seasons in Taiwan have distinct characteristics. In particular, the effects of subtropical island weather, temperature and continental air mass on air quality are unique in the winter season, and the most serious air pollution problems occur around wintertime. Thus, considering the seasonal effect, we adopted observation data from this period. Second, with over 1200 hourly observations at each station, the sample is large enough, from a statistical point of view, for its mean behavior to be representative, which supports splitting the hourly observations into training data and testing (holdout) data. Third, the data were retrieved from the EPA database in Taiwan, where later-period observations contain redundant overlap and clean data are not easy to retrieve, which is why we adopted sample data from the 20 October to 16 December period.
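The training/holdout split described above can be illustrated with a minimal Python sketch. It assumes that the holdout block is simply the final 8 hourly observations of a station's series and that no shuffling takes place; the function name `chronological_split` and the sample readings are hypothetical and are not taken from the study.

```python
from typing import List, Sequence, Tuple


def chronological_split(series: Sequence[float], horizon: int = 8) -> Tuple[List[float], List[float]]:
    """Split a time-ordered series into training data and a final holdout block.

    The last `horizon` observations are held out and nothing is shuffled, so the
    training data never contain values that come after the ones being predicted.
    """
    if not 0 < horizon < len(series):
        raise ValueError("horizon must be positive and shorter than the series")
    values = list(series)
    return values[:-horizon], values[-horizon:]


# Hypothetical hourly PM2.5 readings, oldest first (illustrative values only).
hourly_pm25 = [18.0, 21.0, 25.0, 24.0, 30.0, 28.0, 26.0, 22.0, 19.0, 17.0, 20.0, 23.0]
train, holdout = chronological_split(hourly_pm25, horizon=8)
print(len(train), len(holdout))
```

Keeping the split chronological matters for autocorrelated data such as hourly PM2.5, because a random split would leak future values into the training sample.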

2.3. Methodology

In this study, we attempt to apply machine learning methods to estimate the level of PM2.5 based on hourly observation data. Due to the influence of terrain, temperature, humidity, wind direction and so on, the PM2.5 from the previous period may not have completely dissipated and will affect the PM2.5 concentration in the next period. Autoregression and heteroscedasticity, which are common in time series data, might therefore exist in the dataset, and it is necessary to check the time series features before prediction. Where autoregression and heteroscedasticity exist, we applied the Generalized Autoregressive Conditional Heteroscedasticity (GARCH) model to capture the time series characteristics, and then examined various cross-model integration methods, including integrating the GARCH effect into the GA-SVM model, to establish the method that provides the highest prediction accuracy.
A two-stage approach is performed in this study. In the first stage, as shown in the upper part of Figure 1, we start with an Augmented Dickey–Fuller (ADF) unit root test for stationarity, followed by the GARCH effect diagnosis, including an LM test and an ARCH test, and end with GARCH model estimation.

2.3.1. Unit Root Test

Granger and Newbold suggested that, if an analysis is carried out with non-stationary time series data, the results will be biased and lead to spurious regression [30]. In such cases, the series should be differenced until it is stationary before further analysis. For the stationarity test, we follow Engle and Granger and adopt the Augmented Dickey–Fuller (ADF) test to check whether the PM2.5 concentration in the dataset is a stationary time series [31]. Depending on whether an intercept or a trend is included, the three variations of the ADF test model are:
1. Model with neither intercept nor trend
$$\Delta y_t = \gamma y_{t-1} + \sum_{i=2}^{p} \beta_i \Delta y_{t-i+1} + \varepsilon_t$$
2. Model with intercept but without trend
$$\Delta y_t = \alpha_0 + \gamma y_{t-1} + \sum_{i=2}^{p} \beta_i \Delta y_{t-i+1} + \varepsilon_t$$
3. Model with both intercept and trend
$$\Delta y_t = \alpha_0 + \gamma y_{t-1} + \alpha_2 t + \sum_{i=2}^{p} \beta_i \Delta y_{t-i+1} + \varepsilon_t$$
where $\alpha_0$ is the intercept; $t$ is the time trend; $\gamma$, $\beta_i$ and $\alpha_2$ are parameters to be estimated; $p$ is the optimal lag order; the residual $\varepsilon_t \sim iid(0, \sigma^2)$ is white noise; and the null hypothesis is $H_0: \gamma = 0$. If the test statistic is significant and the null hypothesis is rejected, the time series does not have a unit root and is a stationary time series.
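As a rough illustration of the first variant, the sketch below computes the t-statistic of $\gamma$ in the simplest case, with no intercept, no trend and no lagged difference terms (a plain Dickey–Fuller regression rather than the full augmented test). The function name `dickey_fuller_t`, the simulated series and the approximate large-sample 1% critical value of −2.58 for this variant are assumptions made for the example; they are not the study's data or software.

```python
import math
import random
from typing import Sequence


def dickey_fuller_t(y: Sequence[float]) -> float:
    """t-statistic of gamma in  dy_t = gamma * y_{t-1} + e_t  (no intercept, trend or lags)."""
    lagged = list(y[:-1])
    diffs = [y[t] - y[t - 1] for t in range(1, len(y))]
    sxx = sum(v * v for v in lagged)
    gamma = sum(x * d for x, d in zip(lagged, diffs)) / sxx
    resid = [d - gamma * x for x, d in zip(lagged, diffs)]
    sigma2 = sum(e * e for e in resid) / (len(diffs) - 1)  # one estimated parameter
    return gamma / math.sqrt(sigma2 / sxx)


rng = random.Random(0)
shocks = [rng.gauss(0.0, 1.0) for _ in range(500)]

# Simulated stationary AR(1) series: y_t = 0.5 * y_{t-1} + e_t
ar1 = [0.0]
for e in shocks:
    ar1.append(0.5 * ar1[-1] + e)

# Simulated random walk (unit root): y_t = y_{t-1} + e_t
walk = [0.0]
for e in shocks:
    walk.append(walk[-1] + e)

CRITICAL_1PCT = -2.58  # assumed approximate large-sample value, no-intercept case
print("AR(1) rejects unit root:", dickey_fuller_t(ar1) < CRITICAL_1PCT)
print("random-walk statistic:", dickey_fuller_t(walk))
```

The statistic is compared with Dickey–Fuller critical values rather than the usual Student t table, because under the null hypothesis the statistic does not follow a t distribution.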

2.3.2. ARCH Test

When the conditional variance of the regression residuals is not constant, the usual inference on the estimated coefficients is not valid. Therefore, in traditional quantitative empirical analysis, testing whether the model exhibits heteroscedasticity has become a main step in model diagnosis. Before fitting a GARCH-type model, it is necessary to check whether the sample time series data are heteroscedastic, that is, whether there is an ARCH effect, as the basis for deciding whether an ARCH model can be fitted. For the test method, we applied the Lagrange multiplier (LM) test proposed by Engle to test whether the ARCH effect was present [9]. The testing steps were as follows.
(1) We first run the OLS regression to estimate the appropriate mean equation, $y_t = x_t \alpha$, where $\alpha$ is the regression coefficient estimated by OLS. The residual $\varepsilon_t = y_t - x_t \alpha$ is calculated accordingly, and the squared residual $\varepsilon_t^2$ is saved as another time series.
(2) Regress the squared residual $\varepsilon_t^2$ on an intercept and $q$ lagged terms to calculate the coefficient of determination, $R^2$, of this regression. The estimation function is as follows:
$$\varepsilon_t^2 = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 + \alpha_2 \varepsilon_{t-2}^2 + \cdots + \alpha_q \varepsilon_{t-q}^2$$
(3) Multiply the coefficient of determination, $R^2$, by the number of samples, $T$, to calculate the LM test statistic, $LM = T \times R^2 \sim \chi^2(q)$, where the LM statistic approaches the chi-square distribution with $q$ degrees of freedom. If the resulting LM test statistic significantly rejects the null hypothesis $H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_q = 0$, the time series data inspected has an ARCH effect, and an ARCH or GARCH model should be further fitted.
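Steps (2) and (3) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the software used in the study: the residuals are simulated rather than taken from an OLS mean equation, $T$ is taken as the number of usable observations after lagging, and the helper names `solve_linear` and `arch_lm_statistic` are hypothetical.

```python
import random
from typing import List, Sequence


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def arch_lm_statistic(residuals: Sequence[float], q: int) -> float:
    """LM = T * R^2 from regressing e_t^2 on an intercept and q lags of e^2."""
    sq = [e * e for e in residuals]
    rows = [[1.0] + [sq[t - i] for i in range(1, q + 1)] for t in range(q, len(sq))]
    target = sq[q:]
    k = q + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(k)]
    coef = solve_linear(xtx, xty)  # OLS via the normal equations
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    mean_y = sum(target) / len(target)
    ss_res = sum((y - f) ** 2 for y, f in zip(target, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in target)
    return len(target) * (1.0 - ss_res / ss_tot)


rng = random.Random(1)
iid = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # no ARCH effect by construction

arch, prev = [], 0.0  # simulated ARCH(1) residuals: h_t = 0.2 + 0.5 * e_{t-1}^2
for _ in range(1000):
    prev = rng.gauss(0.0, 1.0) * (0.2 + 0.5 * prev ** 2) ** 0.5
    arch.append(prev)

CHI2_1PCT_DF1 = 6.635  # 1% critical value of chi-square with 1 degree of freedom
lm_iid, lm_arch = arch_lm_statistic(iid, q=1), arch_lm_statistic(arch, q=1)
print(lm_iid, lm_arch, lm_arch > CHI2_1PCT_DF1)
```

An LM value above the chi-square critical value with $q$ degrees of freedom rejects the null hypothesis of no ARCH effect.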

2.3.3. GARCH Model

If the ARCH effect exists in the hourly PM2.5 concentration data, a conditional heteroscedasticity model should be fitted. Econometricians have proposed correction methods to handle the heteroscedasticity of time series data. Among them, the models of Engle [9] and Bollerslev [10] are the most popular.
Engle allowed the conditional variance to change over time in the autoregressive conditional heteroscedasticity (ARCH) model, making the conditional variance a function of the squared residuals of previous periods [9]. Thus, previous volatility affects subsequent volatility, which is in line with the phenomenon of volatility clustering. The model specification of ARCH(q) can be described as follows:
$$\begin{aligned} y_t &= a x_t + \varepsilon_t \\ h_t = \sigma_t^2 &= \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 + \cdots + \alpha_q \varepsilon_{t-q}^2 = \alpha_0 + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2 \\ \varepsilon_t \mid \Omega_{t-1} &\sim N(0, h_t) \end{aligned}$$
where $\alpha_0 > 0$, $\alpha_i > 0$, $i = 1, 2, \ldots, q$; $y_t$ is the time series data; $a x_t$ is the conditional mean of $y_t$; $\Omega_{t-1}$ is the information collected up to period $t-1$; and $h_t$ is the conditional heteroscedastic variance of $y_t$.
Bollerslev further added the lagged conditional variance to the ARCH model, so that ARCH conforms to the traditional ARMA process; the result is called the generalized autoregressive conditional heteroscedasticity (GARCH) model [10]. The conditional variance is affected not only by the squared residuals of previous periods, but also by the conditional variance of previous periods. The GARCH(p,q) model is stated as follows:
$$\begin{aligned} y_t &= a x_t + \varepsilon_t \\ h_t = \sigma_t^2 &= \alpha_0 + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2 + \sum_{j=1}^{p} \beta_j h_{t-j} \\ \varepsilon_t \mid \Omega_{t-1} &\sim N(0, h_t) \end{aligned}$$
where $\alpha_0 > 0$, $\alpha_i > 0$, $i = 1, 2, \ldots, q$; $\beta_j > 0$, $j = 1, 2, \ldots, p$; $y_t$ is the time series data; $a x_t$ is the conditional mean of $y_t$; $\Omega_{t-1}$ is the information collected up to period $t-1$; and $h_t$ is the conditional heteroscedastic variance of $y_t$.
We follow the GARCH(1,1) specification since it is the most popular specification in the prior literature [10,11,32,33].
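For the GARCH(1,1) case, the variance equation reduces to $h_t = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 h_{t-1}$. The sketch below simply evaluates that recursion for given parameters; it does not estimate them. The function name `garch11_variance`, the parameter values, the residuals and the choice to start the recursion at the unconditional variance $\alpha_0 / (1 - \alpha_1 - \beta_1)$ are all assumptions made for illustration.

```python
from typing import List, Sequence


def garch11_variance(residuals: Sequence[float], alpha0: float, alpha1: float, beta1: float) -> List[float]:
    """Return h_t for every t using h_t = alpha0 + alpha1 * e_{t-1}^2 + beta1 * h_{t-1}."""
    if alpha0 <= 0 or alpha1 <= 0 or beta1 <= 0 or alpha1 + beta1 >= 1:
        raise ValueError("need alpha0, alpha1, beta1 > 0 and alpha1 + beta1 < 1")
    # Assumed starting value: the unconditional variance of the process.
    h = [alpha0 / (1.0 - alpha1 - beta1)]
    for t in range(1, len(residuals)):
        h.append(alpha0 + alpha1 * residuals[t - 1] ** 2 + beta1 * h[t - 1])
    return h


# Hypothetical residuals and parameters (not the estimates reported in the study).
eps = [0.5, -2.0, 0.3, 0.1]
h = garch11_variance(eps, alpha0=0.1, alpha1=0.3, beta1=0.6)
for t, value in enumerate(h):
    print(t, round(value, 4))
```

The large shock at $t = 1$ raises the conditional variance at $t = 2$, and the $\beta_1 h_{t-1}$ term lets that elevated variance decay only gradually, which is the volatility clustering described above.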
After the GARCH model was estimated, to make sure the GARCH (1,1) model specification is at its optimal level, we applied the ARCH-LM test to check the model fit as well.
The estimation function of the ARCH-LM test is as follows:
$$\varepsilon_t^2 = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 + \alpha_2 \varepsilon_{t-2}^2 + \cdots + \alpha_q \varepsilon_{t-q}^2 + v_t$$
Since the SVM model is a regression-based model, the prediction result will be biased when autoregression exists in the dataset. If the GARCH effect is confirmed in the first stage, then in the second stage of analysis we integrate the GARCH effect into the GA-SVM model with PM2.5 as the prediction variable. We add the PM2.5 of the previous period (PM2.5 (t−1)), the conditional heteroscedastic variance of the previous period ($h_{t-1}$) and the squared residual of the previous period ($\varepsilon_{t-1}^2$) from the GARCH model estimation into the GA-SVM model to establish an alternative PM2.5 prediction model, and compare its prediction accuracy with that of the traditional GA-SVM model, which does not take the GARCH effect into consideration.
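The three GARCH-derived inputs can be assembled as lagged feature rows, as in the following sketch. It assumes that the residuals and conditional variances from the first stage are already available and aligned hour by hour with the PM2.5 series; the function name `build_garch_features` and the numbers are hypothetical, and the other pollutant and wind predictors used in the study are left out for brevity.

```python
from typing import List, Sequence, Tuple


def build_garch_features(
    pm25: Sequence[float], residuals: Sequence[float], cond_var: Sequence[float]
) -> Tuple[List[List[float]], List[float]]:
    """Pair each target PM2.5(t) with [PM2.5(t-1), h(t-1), eps(t-1)^2]."""
    if not (len(pm25) == len(residuals) == len(cond_var)):
        raise ValueError("all three series must have the same length")
    features, targets = [], []
    for t in range(1, len(pm25)):
        features.append([pm25[t - 1], cond_var[t - 1], residuals[t - 1] ** 2])
        targets.append(pm25[t])
    return features, targets


# Hypothetical aligned series (illustrative values only).
pm25 = [20.0, 24.0, 23.0, 30.0]
eps = [0.5, -2.0, 0.3, 0.1]
h = [1.0, 0.775, 1.765, 1.186]
X, y = build_garch_features(pm25, eps, h)
for row, target in zip(X, y):
    print(row, "->", target)
```

Because every feature is taken from period $t-1$, each row uses only information that would be available before the PM2.5 value at $t$ is observed.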

2.3.4. GA-SVM Model

The Genetic Algorithm (GA) was proposed by John Holland and is based mainly on Darwin’s theory of evolution to simulate the “natural selection” in the evolution of the biological world [34]. The natural elimination mechanism of “survival of the fittest” is widely applied in solving optimization problems, data search, artificial intelligence and machine learning [35]. In the financial field, many scholars have also applied the genetic algorithm to examine various topics, such as: trading systems, stock or portfolio selection, bankruptcy prediction, credit evaluation, budget allocation, etc. [36].
The genetic algorithm mainly operates through three processes: reproduction, crossover and mutation. During the reproduction process, an initial population is randomly generated, and each individual is coded in binary and substituted into a fitness function. Then, based on the obtained fitness value, individuals with high fitness are selected and reproduced to the mating pool. Two individuals are selected randomly in the mating pool for mating each time and the algorithm decides whether the resulting offspring should undergo further mutation. The process of reproduction, crossover and mutation is repeated until the most resilient population is produced. A processing flow chart of the genetic algorithm is shown in Figure 2 [36].
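The reproduction, crossover and mutation loop described above can be sketched as a small binary-coded genetic algorithm. This is an illustrative toy rather than the configuration used in the study: the bit-counting fitness function stands in for whatever score a GA-SVM setup would assign to a candidate solution, the population size and rates are arbitrary, and the elitism step (always keeping the best individual found so far) is an addition that the text does not describe.

```python
import random
from typing import Callable, List, Tuple

Chromosome = List[int]


def run_ga(
    fitness: Callable[[Chromosome], float],
    n_bits: int,
    pop_size: int = 20,
    generations: int = 40,
    crossover_rate: float = 0.8,
    mutation_rate: float = 0.02,
    seed: int = 0,
) -> Tuple[Chromosome, List[float]]:
    """Maximize a non-negative fitness over binary strings; returns (best, best-fitness history)."""
    if pop_size % 2 or n_bits < 2:
        raise ValueError("pop_size must be even and n_bits at least 2")
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(population, key=fitness)
    history = [fitness(best)]
    for _ in range(generations):
        # Reproduction: fitness-proportional selection into the mating pool.
        weights = [fitness(ind) + 1e-9 for ind in population]
        pool = rng.choices(population, weights=weights, k=pop_size)
        offspring: List[Chromosome] = []
        for i in range(0, pop_size, 2):
            a, b = pool[i][:], pool[i + 1][:]
            # Crossover: swap the tails of the two parents at a random cut point.
            if rng.random() < crossover_rate:
                cut = rng.randint(1, n_bits - 1)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            offspring.extend([a, b])
        # Mutation: flip each bit with a small probability.
        for ind in offspring:
            for j in range(n_bits):
                if rng.random() < mutation_rate:
                    ind[j] = 1 - ind[j]
        offspring[0] = best[:]  # elitism (an assumption of this sketch)
        population = offspring
        best = max(population, key=fitness)
        history.append(fitness(best))
    return best, history


# Toy fitness: the number of 1-bits in the chromosome.
best, history = run_ga(fitness=lambda bits: float(sum(bits)), n_bits=16)
print(best, history[0], history[-1])
```

With elitism, the best fitness in the history never decreases from one generation to the next, which makes convergence easy to inspect.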
SVM is a learning method widely applied in classification-related topics. SVM was proposed in 1995 by Vladimir Naumovich Vapnik and the AT&T laboratory team [37]. SVM is a machine learning system developed based on the Structural Risk Minimization (SRM) method in statistical learning theory. The main concept of SVM is to use a separating hyperplane to divide data into two or more classes and to deal with the problem of classification in data mining.

2.3.5. Evaluation Indicators for Prediction Models

Regarding the performance evaluation of prediction models, four indicators are generally adopted: mean absolute percentage error (MAPE), root mean squared error (RMSE), mean absolute error (MAE) and correlation coefficient (CC) [38]. Specifications of the four indicators are presented in Table 2, based on Witten, Frank and Hall [38]. MAPE is the most commonly used criterion for prediction performance evaluation [36]. We further added RMSE to reinforce our evaluation and present the results in Section 3.3.
Lewis stated that MAPE is often applied to evaluate the predictive ability of a model [39]. Because MAPE expresses each error as a percentage of the actual value, the basis of comparison does not become unstable with the size of the values. When MAPE is the measure, the value of (1 − MAPE) represents the accuracy of prediction; thus, the lower the MAPE value, the better the predictive ability. RMSE mainly measures the degree of deviation between the predicted value and the actual value. The degree of deviation is standardized with the actual value of the variable, so the predictive ability of each variable can be compared. When RMSE is the measure, the closer its value is to 0, the better the predictive ability.
In Table 3, we present the interpretation of MAPE values based on Lewis [39]. For example, if the MAPE value is less than 10%, the prediction performance will be classified as “highly accurate forecasting”. When the MAPE value is above 50%, the prediction performance will be classified as “inaccurate forecasting”.
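The two indicators used in this study can be computed as in the sketch below. Since Table 2 is not reproduced here, the code assumes the standard textbook definitions of MAPE and RMSE, which may differ in detail from the specifications in that table. Only the "below 10%" and "above 50%" bands are stated in the text; the two intermediate labels follow the commonly cited version of Lewis's scale and should be read as an assumption. The function names and the sample values are hypothetical.

```python
import math
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error in percent; actual values must be non-zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error (standard definition, in the units of the data)."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def lewis_category(mape_percent: float) -> str:
    """Map a MAPE value in percent to an interpretation band."""
    if mape_percent < 10:
        return "highly accurate forecasting"
    if mape_percent < 20:
        return "good forecasting"  # assumed intermediate band
    if mape_percent <= 50:
        return "reasonable forecasting"  # assumed intermediate band
    return "inaccurate forecasting"


# Hypothetical actual and predicted PM2.5 values (illustrative only).
actual = [20.0, 25.0, 40.0, 50.0]
predicted = [22.0, 24.0, 36.0, 55.0]
m = mape(actual, predicted)
print(m, rmse(actual, predicted), lewis_category(m))
```

MAPE is undefined when an actual value is zero, so a series containing zero readings would need separate handling before this measure is applied.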

3. Results and Discussion

3.1. GARCH Effect Diagnosis

In Table 4, we present the results of the GARCH effect diagnosis for each of the five air pollution monitoring stations in Taichung, Taiwan, namely the FongYen, SaLu, DaLi, ChungMin and SeaTun stations, which are labeled as Stations 1, 2, 3, 4 and 5, respectively, in Table 4. Before the diagnosis, we start with an ADF unit root test for stationarity, as suggested by Engle and Granger [31], and find that the test statistics at all five stations are significant, with reported probabilities of 0.0000. The results reject the null hypothesis of non-stationarity at the 0.01 significance level; thus, the PM2.5 time series at each station is stationary and can be used for further estimation.
As shown in Table 4, at all five monitoring stations, the F-statistics of the OLS model are statistically significant at the 0.01 level and the adjusted R2 values are at least 0.7773. The coefficient estimates and the t-statistics of each variable at each station are presented in the table. The coefficient estimates of the PM2.5 concentration in the previous period (PM2.5 (t−1)) at Stations 1 to 5 are 0.5408, 0.5828, 0.4945, 0.5563 and 0.5543, respectively; all are statistically significant at the 0.01 level. Similar results are found for the coefficient estimates of PM10 and sulfur dioxide (SO2); both are statistically significant at the 0.01 level at all five stations. The coefficient estimates of ozone (O3) were not significant at any of the five stations.
As we expected, autocorrelation in the PM2.5 data does exist, and the result of the LM test reconfirms this. As shown in the lower part of Table 4, the F-statistics are all significant at the 0.01 level. The test statistics, Obs*R2, at the five stations are 31.9702, 40.0022, 41.4075, 42.1737 and 40.6592, respectively; all are significant at the 0.01 level and reject the null hypothesis of no autocorrelation. We further performed an ARCH test to investigate whether a GARCH effect existed in our sample data. As shown at the bottom of Table 4, the F-statistics are all significant at the 0.01 level. The test statistics, Obs*R2, at the five stations are 194.5480, 24.2067, 30.4505, 89.7771 and 132.2895, respectively; all are significant at the 0.01 level and reject the null hypothesis of no GARCH effect, implying that the GARCH effect does exist in our sample data.
The results of the LM test and ARCH test indicate that the PM2.5 time series data is autoregressive and does have an ARCH effect. A bias would be expected if we further estimated with a regression-based model, such as the SVM model. Thus, an ARCH or GARCH model should be further fitted.

3.2. GARCH Estimation

In Table 5, we present the GARCH estimation and model specification for each of the five monitoring stations. We follow the GARCH(1,1) specification since it is the most popular specification in the prior literature [10,11,32,33]. As shown in the upper section of Table 5, for all five stations, the coefficient estimates of both PM2.5 (t−1), the PM2.5 observation in the previous period, and PM10 were statistically significant at the 0.01 level. Coefficient estimates of PM2.5 (t−1) at Stations 1 to 5 were 0.5210, 0.5528, 0.4402, 0.5577 and 0.4786, respectively. Coefficient estimates of PM10 at Stations 1 to 5 were 0.1634, 0.1635, 0.2327, 0.1755 and 0.2001, respectively. Coefficient estimates of SO2 at Stations 1 to 5 were 0.2115, 0.5537, 0.6148, 0.2355 and 0.2485, respectively, and all were statistically significant at the 0.01 level. The coefficient estimates of ozone (O3) were not significant at any of the five stations.
We also found that the coefficient estimates of nitric oxide (NO), nitrogen dioxide (NO2) and nitrogen oxide (NOx) were statistically significant at the 0.01, 0.05 and 0.05 levels, respectively, only at Station 3 (DaLi), and were not significant at the other four stations. Conversely, coefficient estimates of wind speed (WindSpeed) were significant at Stations 1, 2, 4 and 5 but not at Station 3 (DaLi). A similar result is found for the coefficient estimates of carbon monoxide (CO), which were significant at Stations 1, 2, 4 and 5, all at the 0.01 level, but not at Station 3 (DaLi).
In the lower section of Table 5, we present the variance estimates of the GARCH model, including the squared residual of the previous period ($\varepsilon_{t-1}^2$) and the conditional heteroscedastic variance of the previous period ($h_{t-1}$), at the five stations. Coefficient estimates of $\varepsilon_{t-1}^2$ at Stations 1 to 5 are 0.3133, 0.1689, 0.4319, 0.2426 and 0.1741, respectively; all are statistically significant at the 0.01 level. Coefficient estimates of $h_{t-1}$ at Stations 1 to 5 are 0.5639, 0.8122, 0.2937, 0.6820 and 0.7891, respectively; all are statistically significant at the 0.01 level.
The empirical results support our expectation that the PM2.5 observation in the previous period (PM2.5 (t−1)), the conditional heteroscedastic variance ($h_{t-1}$) and the squared residual of the GARCH model ($\varepsilon_{t-1}^2$) are appropriate prediction variables to incorporate into a GA-SVM model in our second-stage procedure to establish an alternative PM2.5 prediction model. After the GARCH model was estimated, to make sure the GARCH(1,1) model specification was at an optimal level, we performed an ARCH-LM test to check the model fit as well. Test statistics at all five monitoring stations except Station 3 (DaLi) were significant, rejecting the null hypothesis at the 0.01 level, indicating that the GARCH model fit is optimized.

3.3. Evaluations of the Prediction Models

In the second stage of our analysis, we integrate the GARCH effect into the GA-SVM model by adding the PM2.5 observation in the previous period (PM2.5 (t−1)), the lagged conditional variance ($h_{t-1}$) and the lagged squared residual of the GARCH model ($\varepsilon_{t-1}^2$) to establish an alternative PM2.5 prediction model (GA-SVM-GARCH), and compare the prediction performance of two traditional approaches, the SVM and the GA-SVM models, with that of our proposed alternative model. The observation period for our study was from 20 October 2020 to 16 December 2020. The dataset was split into two sample sets, training data and testing (holdout) data, to evaluate the prediction performance. We held the testing data as the out-of-sample test and used the holdout test to predict the hourly PM2.5 concentration in the next 8 h in January 2021.
In Table 6, we present the MAPE and RMSE values of the SVM, GA-SVM and GA-SVM-GARCH models. MAPE is the most commonly used criterion for prediction performance evaluation [35]. The prediction accuracy was calculated by subtracting the MAPE value from one. For example, the MAPE values of the SVM, GA-SVM and GA-SVM-GARCH models at Station 2 (SaLu) are 35.94%, 33.14% and 0.68%, respectively, indicating prediction accuracies of 64.06%, 66.86% and 99.32%, respectively. The results indicate a significant increase in prediction accuracy with the GA-SVM-GARCH model, by over 30 percentage points compared to the traditional SVM and GA-SVM models. We further added RMSE as the second evaluation indicator to reinforce our evaluation. The closer the RMSE value is to 0, the better the predictive ability of the model. For example, at Station 2 (SaLu), the RMSE values of the SVM, GA-SVM and GA-SVM-GARCH models were 4.7724, 4.4938 and 0.0950, respectively, further supporting the better prediction performance of the GA-SVM-GARCH model.
In terms of performance comparison, Song et al. adopted China’s electricity supply as an observation sample and found that the MAPE was reduced from 1.72% in the FKM-ASVM model without GARCH to 0.74% in the FKM-ASVM-GARCH ECM model with GARCH [27]. In our proposed alternative model, the MAPE value at Station 4 fell from 26.89% in the SVM model and 26.64% in the GA-SVM model to 0.14% in the GA-SVM-GARCH model. The results indicate that integrating the GARCH model can indeed improve the accuracy of SVM prediction models and that our proposed GA-SVM-GARCH model provides the highest accuracy.
In Figure 3 and Figure 4, we present graphical comparisons of the MAPE and RMSE values of the three models above at the five monitoring stations. As shown in Figure 3, the MAPE values range from 0.14% to 0.68% in the GA-SVM-GARCH model, compared to 10.42% to 33.14% in the GA-SVM model and 10.33% to 35.94% in the SVM model. In Figure 4, the RMSE values range from 0.0225 to 0.0950 in the GA-SVM-GARCH model, compared to 3.1561 to 7.3327 in the GA-SVM model and 2.9855 to 7.2913 in the SVM model.
As shown in Table 6, Figure 3 and Figure 4, the MAPE and RMSE values of the GA-SVM-GARCH model are lower than those of the SVM and GA-SVM models. Overall, our proposed alternative model outperformed the traditional SVM and GA-SVM models in terms of both MAPE and RMSE. When we integrated the GARCH effect into the GA-SVM model, the improvement in prediction accuracy exceeded our expectations.

4. Conclusions

Air pollution, especially that of PM2.5 resulting from industrialization, has fostered a wave of global weather migration and jeopardized human health in the past three decades. The prediction and control of air pollution, especially PM2.5, has been a critical issue for regulators, practitioners and academics in Taiwan. Much research has been carried out in search of better prediction models, yet there is no consensus on which approach is the most accurate. In this study, we conducted a two-stage analysis and explored whether integrating the GA-SVM model with the GARCH effect yields a more accurate air pollution prediction model. The study adopted the region with the worst PM2.5 levels, central Taiwan, as the sample source.
In the first stage of analysis, the examination started with an ADF test for stationarity in our time series data, followed by an LM test and an ARCH test to investigate whether autoregression and the ARCH effect existed in our dataset. We further estimated a GARCH(1,1) model to confirm the existence of a GARCH effect, and in the second stage of analysis we integrated the GARCH effect into the GA-SVM model with PM2.5 as the predictive variable. The empirical results indicate that our proposed alternative model outperformed the traditional SVM and GA-SVM models in terms of both MAPE and RMSE as the accuracy indicators.
Consistent with the previous SVM literature [7,23,26,27], which shows a trend of integrating various approaches into the SVM model, our empirical results support our expectation that adopting an SVM-based model for PM2.5 prediction is appropriate and that prediction performance can be improved by integrating models, such as incorporating the GARCH effect into the GA-SVM-based approach. Moreover, consistent with our prior expectation, the evidence further supports that taking the GARCH effect into account in the GA-SVM model clearly improves the accuracy of prediction. To the knowledge of the authors, this study is the first to attempt to integrate the GARCH effect into the GA-SVM model in the prediction of PM2.5. It is possible to compare the empirical prediction results of this study with findings in the recent PM2.5 literature, although different methods were adopted. Studies on the Taichung region by Chen et al. [2] and Kristiani et al. [19], a cross-country study (Kusuma et al. [3]) and a sub-district study in China (Long et al. [5]) are comparable.
In summary, this study has implications for sustainability management by both government and industry. When a regression-based approach is adopted, ignoring possible autoregressive characteristics in a time series dataset tends to result in lower prediction efficiency and accuracy. Furthermore, if variance clustering exists in a time series dataset, the choice of prediction model should account for this phenomenon. It is highly recommended to take the GARCH effect into consideration in air pollution prediction to capture variance clustering and to improve prediction efficiency and accuracy. Moreover, this study may shed light on the application of the GARCH model, as well as machine learning methods, in the air pollution prediction literature.

Author Contributions

Conceptualization, K.-C.Y., H.-W.H., M.-H.H. and T.-C.W.; methodology, formal analysis, K.-C.Y., H.-W.H., M.-H.H. and T.-C.W.; resources, writing—original draft preparation, K.-C.Y., H.-W.H., M.-H.H. and T.-C.W.; writing—review and editing, K.-C.Y., H.-W.H., M.-H.H. and T.-C.W. All authors have read and agreed to the published version of the manuscript.

Funding

The authors would like to acknowledge that the APC was funded by Ministry of Science and Technology, Taiwan (MOST 109-2511-H-018-018-MY3).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data are available on request.

Acknowledgments

The authors would like to express our deep appreciation for the insightful comments offered by two anonymous reviewers to enhance the quality and readability of our paper. The authors would also like to acknowledge the financial support provided by Ministry of Science and Technology, Taiwan (MOST 109-2511-H-018-018-MY3).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Wikipedia. Air Pollution in Taiwan. Available online: https://en.wikipedia.org/wiki/Air_pollution_in_Taiwan (accessed on 27 February 2022).
  2. Chen, H.L.; Li, C.P.; Tang, C.S.; Lung, S.C.C.; Chuang, H.C.; Chou, D.W.; Chang, L.T. Risk assessment for people exposed to PM2.5 and constituents at different vertical heights in an urban area of Taiwan. Atmosphere 2020, 11, 1145.
  3. Kusuma, W.L.; Wu, C.D.; Zeng, Y.T.; Hapsari, H.H.; Muhamad, J.L. PM2.5 pollutant in Asia—A comparison of metropolis cities in Indonesia and Taiwan. Int. J. Environ. Res. Public Health 2019, 16, 4924.
  4. Xie, M.; Zhou, X. Taiwan Air Pollution Problem and Prevention Policy, National Policy Research Foundation. Available online: https://www.npf.org.tw/2/18414 (accessed on 27 February 2022).
  5. Long, Y.; Wang, J.; Wu, K.; Zhang, J. Population exposure to ambient PM2.5 at the subdistrict level in China. Int. J. Environ. Res. Public Health 2018, 15, 2683.
  6. Chou, K.T. From anti-pollution to climate change risk movement: Reshaping civic epistemology. Sustainability 2015, 7, 14574–14596.
  7. Feng, X.; Li, Q.; Zhu, Y.; Hou, J.; Jin, L.; Wang, J. Artificial neural networks forecasting of PM2.5 pollution using air mass trajectory based geographic model and wavelet transformation. Atmos. Environ. 2015, 107, 118–128.
  8. Delavar, M.R.; Gholami, A.G.; Shiran, R.; Rashidi, Y.; Nakhaeizadeh, G.; Fedra, R.K.; Afshar, S.H. A novel method for improving air pollution prediction based on machine learning approaches: A case study applied to the capital city of Tehran. Int. J. Geo-Inf. 2019, 8, 99.
  9. Engle, R.F. Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica 1982, 50, 987–1007.
  10. Bollerslev, T. Generalized autoregressive conditional heteroscedasticity. J. Econom. 1986, 31, 307–327.
  11. Hansen, P.R.; Lunde, A. A forecast comparison of volatility models: Does anything beat a garch(1,1)? J. Appl. Econom. 2005, 20, 873–889.
  12. Zickus, M.; Greig, A.J.; Niranjan, M. Comparison of four machine learning methods for predicting PM10 concentrations in Helsinki, Finland. Water Air Soil Pollut. Focus 2002, 2, 717–729.
  13. Dudot, A.L.; Rynkiewicz, J.; Steiner, F.E.; Rude, J. A 24-h forecast of ozone peaks and exceedance levels using neural classifiers and weather predictions. Environ. Model. Softw. 2007, 22, 1261–1269.
  14. Voukantsis, D.; Karatzas, K.; Kukkonen, J.; Räsänen, T.; Karppinen, A.; Kolehmainen, M. Intercomparison of air quality data using principal component analysis, and forecasting of PM10 and PM2.5 concentrations using artificial neural networks, in Thessaloniki and Helsinki. Sci. Total Environ. 2011, 409, 1266–1276.
  15. Russo, A.; Raischel, F.; Lind, P.G. Air quality prediction using optimal neural networks with stochastic variables. Atmos. Environ. 2013, 79, 822–830.
  16. Tamas, W.; Notton, G.; Paoli, C.; Voyant, C.; Nivet, M.L.; Balu, A. Urban ozone concentration forecasting with artificial Neural Network in Corsica. Math. Model. Civil. Eng. 2014, 10, 29–37.
  17. Zhou, Q.; Jiang, H.; Wang, J.; Zhou, J. A hybrid model for PM2.5 forecasting based on ensemble empirical mode decomposition and a general regression neural network. Sci. Total Environ. 2014, 496, 264–274.
  18. Elangasinghe, M.A.; Singhal, N.; Dirks, K.N.; Salmond, J.A.; Samarasinghe, S. Complex time series analysis of PM10 and PM2.5 for a coastal site using artificial neural network modelling and kmeans clustering. Atmos. Environ. 2014, 94, 106–116.
  19. Kristiani, E.; Lin, H.; Lin, J.R.; Chuang, Y.H.; Huang, C.Y.; Yang, C.T. Short-term prediction of PM2.5 using LSTM deep learning methods. Sustainability 2022, 14, 2068.
  20. Zheng, Y.; Yi, X.; Li, M.; Li, R.; Shan, Z.; Chang, E.; Li, T. Forecasting fine-grained air quality based on big data. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Sydney, Australia, 10–13 August 2015; pp. 2267–2276.
  21. Wang, T.; Jiang, F.; Deng, J.; Shen, Y.; Fu, Q.; Wang, Q.; Fu, Y.; Xu, J.; Zhang, D. Urban air quality and regional haze weather forecast for Yangtze River Delta region. Atmos. Environ. 2012, 58, 70–83.
  22. Saide, P.E.; Mena-Carrasco, M.; Tolvett, S.; Hernandez, P.; Carmichael, G.R. Air quality forecasting for winter-time PM2.5 episodes occurring in multiple cities in central and southern Chile. J. Geophys. Res. Atmos. 2016, 121, 558–575.
  23. Hu, K.; Sivaraman, V.; Bhrugubanda, H.; Kang, S.; Rahman, A. SVR based dense air pollution estimation model using static and wireless sensor network. In Proceedings of the 2016 IEEE Sensors, Orlando, FL, USA, 30 October–3 November 2016; pp. 1–3.
  24. Davis, J.M.; Speckman, P. A model for predicting maximum and 8 hr average ozone in Houston. Atmos. Environ. 1999, 33, 2487–2500.
  25. Siwek, K.; Osowski, S.; Garanty, K.; Sowinski, M. Ensemble of predictors for forecasting the PM10 pollution. In Proceedings of the VXV International Symposium on Theoretical Engineering, Lübeck, Germany, 22–24 June 2009; pp. 1–5.
  26. Sotomayor-Olmedo, A.; Aceves-Fernandez, M.A.; Gorrostieta-Hurtado, E.; Pedraza-Ortega, C.; Ramos-Arreguin, J.M.; Vargas-Soto, J.E. Forecast urban air pollution in Mexico City by using support vector machines: A kernel performance approach. Int. J. Intell. Sci. 2013, 3, 126–135.
  27. Song, Z.; Niu, D.; Xiao, X.; Wu, H. Daily peak load forecasting based on fast k-medoids clustering, GARCH error correction and SVM model. J. Appl. Sci. Eng. 2016, 19, 249–258.
  28. Ishak, A.B.; Daoud, M.B.; Trabelsi, A. Ozone concentration forecasting using statistical learning approaches. J. Mater. Environ. Sci. 2017, 8, 4532–4543.
  29. Lin, K.M.; Chang, Y.S.; Zeng, Y.R.; Huang, C.X. Air pollution forecasting using machine learning methods on big data platform. In Proceedings of the TANET—Taiwan Academic NETwork Conference, Taoyuan, Taiwan, 20–24 October 2018; pp. 740–745.
  30. Granger, C.W.J.; Newbold, P. Spurious regressions in econometrics. J. Econom. 1974, 2, 111–120.
  31. Engle, R.F.; Granger, C.W.J. Co-integration and error correction: Representation, estimation and testing. Econometrica 1987, 55, 251–276. [Google Scholar] [CrossRef]
  32. Christoffersen, P.F.; Diebold, F.X. Further results on forecasting and model selection under asymmetric loss. J. Appl. Econom. 1996, 11, 561–571. [Google Scholar] [CrossRef]
  33. Gerlach, R.; Chen, C.W.S.; Lin, D.S.Y.; Huang, M.H. Asymmetric responses of international stock markets to trading. Phys. A Stat. Mech. Its Appl. 2006, 360, 422–444. [Google Scholar] [CrossRef]
  34. Holland, J.H. Adaptation in Natural and Artificial Systems; The University of Michigan Press: Ann Arbor, MI, USA, 1975. [Google Scholar]
  35. Goldberg, D.E.; Korb, B.; Deb, K. Messy genetic algorithms: Motivation, analysis, and first results. Complex Syst. 1989, 3, 493–530. [Google Scholar]
  36. Shin, K.S.; Lee, Y.J. A genetic algorithm application in bankruptcy prediction modeling. Expert Syst. Appl. 2002, 23, 321–328. [Google Scholar] [CrossRef]
  37. Vapnik, V. Estimation of Dependences Based on Empirical Data; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2006; pp. 430–438. [Google Scholar]
  38. Witten, I.H.; Frank, E.; Hall, M.A. Data Mining: Practical Machine Learning Tools and Techniques, 3rd ed.; Morgan Kaufmann: Burlington, MA, USA, 2011; p. 180.
  39. Lewis, C.D. Industrial and Business Forecasting Methods; Butterworths: London, UK, 1982; p. 40.
Figure 1. The Analysis Process Flow Chart.
Figure 2. Processing Flow Chart of the Genetic Algorithm (GA).
Figure 3. MAPE comparison of three SVM models in five monitoring stations.
Figure 4. RMSE comparison of three SVM models in five monitoring stations.
Table 1. Basic features of observation data.

| | Station 1 | Station 2 | Station 3 | Station 4 | Station 5 |
|---|---|---|---|---|---|
| Location | FongYen | SaLu | DaLi | ChungMin | SeaTun |
| Duration | 58 days | 58 days | 58 days | 58 days | 58 days |
| Frequency | Hourly | Hourly | Hourly | Hourly | Hourly |
| Observations | 1211 | 1211 | 1215 | 1211 | 1211 |

Note: Stations 1, 2, 3, 4 and 5 represent monitoring stations in the counties of FongYen, SaLu, DaLi, ChungMin and SeaTun, respectively. Data was retrieved from the EPA database in Taiwan. The observation period was from 20 October 2020 to 16 December 2020.
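Table 2. Performance Measures for Numeric Prediction.

| Measure | Formula |
|---|---|
| Mean absolute percentage error (MAPE) | $\frac{100\%}{n} \times \left( \left\lvert \frac{p_1 - a_1}{a_1} \right\rvert + \dots + \left\lvert \frac{p_n - a_n}{a_n} \right\rvert \right)$ |
| Root mean squared error (RMSE) | $\sqrt{\frac{(p_1 - a_1)^2 + \dots + (p_n - a_n)^2}{n}}$ |
| Mean absolute error (MAE) | $\frac{\lvert p_1 - a_1 \rvert + \dots + \lvert p_n - a_n \rvert}{n}$ |
| Correlation coefficient (C. C.) | $\frac{S_{PA}}{\sqrt{S_P S_A}}$, where $S_{PA} = \frac{\sum_i (p_i - \bar{p})(a_i - \bar{a})}{n - 1}$, $S_P = \frac{\sum_i (p_i - \bar{p})^2}{n - 1}$, $S_A = \frac{\sum_i (a_i - \bar{a})^2}{n - 1}$ |

Here, $\bar{a}$ is the mean value over the test data.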
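The measures in Table 2 can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation; it assumes that p denotes the predicted values and a the observed values, as is conventional for these measures, and the function names are hypothetical.

```python
import math


def mape(pred, actual):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    n = len(actual)
    return 100.0 / n * sum(abs((p - a) / a) for p, a in zip(pred, actual))


def rmse(pred, actual):
    """Root mean squared error."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / n)


def mae(pred, actual):
    """Mean absolute error."""
    n = len(actual)
    return sum(abs(p - a) for p, a in zip(pred, actual)) / n


def correlation(pred, actual):
    """Correlation coefficient S_PA / sqrt(S_P * S_A)."""
    n = len(actual)
    p_bar = sum(pred) / n
    a_bar = sum(actual) / n
    s_pa = sum((p - p_bar) * (a - a_bar) for p, a in zip(pred, actual)) / (n - 1)
    s_p = sum((p - p_bar) ** 2 for p in pred) / (n - 1)
    s_a = sum((a - a_bar) ** 2 for a in actual) / (n - 1)
    return s_pa / math.sqrt(s_p * s_a)


# Hypothetical PM2.5 readings, used only to exercise the functions.
predicted = [12.0, 18.0, 33.0]
observed = [10.0, 20.0, 30.0]
print(mape(predicted, observed), rmse(predicted, observed), mae(predicted, observed))
```

MAPE is scale-free but undefined when an observed value is zero, whereas RMSE and MAE are expressed in the units of the series.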
Table 3. Interpretation of MAPE values.

| MAPE | Interpretation |
|---|---|
| MAPE < 10% | Highly accurate forecasting |
| 10% < MAPE < 20% | Good forecasting |
| 20% < MAPE < 50% | Reasonable forecasting |
| 50% < MAPE | Inaccurate forecasting |

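The bands in Table 3 translate directly into a lookup function. The sketch below is illustrative only: the function name is hypothetical, and because the table uses strict inequalities, assigning the boundary values (exactly 10%, 20% and 50%) to the higher band is an assumption made here.

```python
def interpret_mape(mape_percent):
    """Map a MAPE value (in percent) to the interpretation bands of Table 3.

    Assumption: boundary values (10, 20, 50) fall into the higher band,
    since the table itself leaves them unassigned.
    """
    if mape_percent < 10.0:
        return "Highly accurate forecasting"
    if mape_percent < 20.0:
        return "Good forecasting"
    if mape_percent < 50.0:
        return "Reasonable forecasting"
    return "Inaccurate forecasting"


# Station 1 MAPE values reported in Table 6 for the three models.
for model, value in [("SVM", 10.33), ("GA-SVM", 10.42), ("GA-SVM-GARCH", 0.32)]:
    print(model, interpret_mape(value))
```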
Table 4. Results of GARCH Effect Diagnosis.

| Variable | Station 1 | Station 2 | Station 3 | Station 4 | Station 5 |
|---|---|---|---|---|---|
| PM2.5 (t−1) | 0.5408 | 0.5828 | 0.4945 | 0.5563 | 0.5543 |
| | 25.80 *** | 31.70 *** | 24.83 *** | 30.10 *** | 28.47 *** |
| CO | −0.9842 | 0.5406 | −0.1548 | −0.2233 | 0.9440 |
| | −1.94 | 1.14 | −0.47 | −0.51 | 1.70 |
| NO | 1.2822 | −0.0410 | −0.0046 | 0.2324 | 0.2220 |
| | 1.55 | −0.56 | −0.01 | 0.47 | 0.34 |
| NO2 | 1.2568 | 0.1181 | −0.1277 | 0.2902 | 0.1788 |
| | 1.54 | 2.19 ** | −0.23 | 0.59 | 0.27 |
| NOx | −1.0688 | −0.0139 | 0.1713 | −0.2036 | −0.1715 |
| | −1.31 | −0.31 | 0.30 | −0.42 | −0.26 |
| O3 | −0.0102 | 0.0090 | 0.0068 | 0.0030 | 0.0016 |
| | −0.82 | 0.75 | 0.51 | 0.25 | 0.13 |
| PM10 | 0.1547 | 0.1734 | 0.1970 | 0.1927 | 0.1801 |
| | 13.28 *** | 16.86 *** | 18.01 *** | 17.60 *** | 16.73 *** |
| SO2 | 0.4612 | 0.5779 | 0.6934 | 0.5627 | 0.3856 |
| | 9.84 *** | 9.84 *** | 11.05 *** | 8.72 *** | 6.07 *** |
| WindDirection | 0.0004 | 0.0012 | 0.0019 | −0.0003 | −0.0026 |
| | 0.30 | 1.30 | 1.82 | −0.42 | −2.52 ** |
| WindSpeed | 0.3763 | −0.1501 | −0.0007 | −0.4826 | −0.5618 |
| | −2.15 ** | −1.86 * | −0.09 | −3.14 *** | −5.11 *** |
| C | 0.8794 | −1.3351 | −1.2469 | −0.1998 | 2.0251 |
| | 1.71 * | −2.75 *** | −3.56 *** | −0.39 | 3.43 *** |
| Observation | 1211 | 1211 | 1215 | 1211 | 1211 |
| F-statistic | 423.2258 | 739.5768 | 710.4244 | 740.6729 | 602.5055 |
| Prob. (F-stat.) | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| Adj. R² | 0.7773 | 0.8592 | 0.8532 | 0.8594 | 0.8325 |
| LM Test | | | | | |
| F-statistic | 16.2423 | 20.4623 | 21.2019 | 21.6132 | 20.8101 |
| Prob. (F-stat.) | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| Obs*R² | 31.9702 | 40.0022 | 41.4075 | 42.17369 | 40.6592 |
| Prob. (Chi²) | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| ARCH test | | | | | |
| F-statistic | 231.4378 | 24.6600 | 31.1825 | 96.8117 | 148.2826 |
| Prob. (F-stat.) | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| Obs*R² | 194.5480 | 24.2067 | 30.4505 | 89.7771 | 132.2895 |
| Prob. (Chi²) | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |

Note: The ADF test statistics for all five stations are significant, with the chi-square probability equal to 0.0000; thus, the data at each station have no unit root and form a stationary time series. Stations 1, 2, 3, 4 and 5 represent monitoring stations in the counties of FongYen, SaLu, DaLi, ChungMin and SeaTun, respectively. Variables examined include the fine-particle observation in the previous period (PM2.5 (t−1)), carbon monoxide (CO), nitric oxide (NO), nitrogen dioxide (NO2), nitrogen oxides (NOx), ozone (O3), suspended particulate matter (PM10), sulfur dioxide (SO2), wind direction (WindDirection), and wind speed (WindSpeed). t-statistic values are presented below each coefficient estimate. ***, ** and * indicate statistical significance at 1%, 5% and 10%, respectively.
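The Obs*R² rows under "ARCH test" in Table 4 correspond to the standard form of Engle's ARCH LM statistic: the squared regression residuals are regressed on their own lags, and the number of observations times the R² of that auxiliary regression is compared with a chi-square distribution whose degrees of freedom equal the number of lags. The sketch below illustrates that computation with NumPy. It is not the authors' code; the lag order, the function names and the simulated series are assumptions for illustration.

```python
import numpy as np


def arch_lm_statistic(resid, lags=1):
    """Obs*R^2 from regressing squared residuals on `lags` of their own lags."""
    e2 = np.asarray(resid, dtype=float) ** 2
    y = e2[lags:]
    # Columns: intercept, then e2 lagged by 1..lags periods.
    columns = [np.ones(len(y))]
    columns += [e2[lags - k: len(e2) - k] for k in range(1, lags + 1)]
    X = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    ss_res = np.sum((y - X @ beta) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r_squared = 1.0 - ss_res / ss_tot
    return len(y) * r_squared


def simulate_arch1(n, omega, alpha, seed):
    """Simulate a hypothetical ARCH(1) residual series for demonstration."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    e = np.zeros(n)
    h = omega / (1.0 - alpha)  # start at the unconditional variance
    for t in range(n):
        if t > 0:
            h = omega + alpha * e[t - 1] ** 2
        e[t] = np.sqrt(h) * z[t]
    return e


arch_resid = simulate_arch1(2000, omega=1.0, alpha=0.5, seed=1)
iid_resid = np.random.default_rng(0).standard_normal(2000)
print("ARCH(1) series:", arch_lm_statistic(arch_resid, lags=1))
print("i.i.d. series: ", arch_lm_statistic(iid_resid, lags=1))
```

A statistic far above the chi-square critical value (about 3.84 at the 5% level for one lag) indicates conditional heteroskedasticity in the residuals, which is the pattern Table 4 reports for all five stations.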
Table 5. Result of GARCH (1,1) Estimation.

| Variable | Station 1 | Station 2 | Station 3 | Station 4 | Station 5 |
|---|---|---|---|---|---|
| GARCH (1,1) | | | | | |
| PM2.5 (t−1) | 0.5210 | 0.5528 | 0.4402 | 0.5577 | 0.4786 |
| | 27.31 *** | 33.80 *** | 31.17 *** | 41.47 *** | 28.22 *** |
| CO | 2.9016 | 2.4174 | 0.5319 | 2.5439 | 3.3907 |
| | 6.28 *** | 8.16 *** | 1.47 | 9.13 *** | 5.75 *** |
| NO | 0.9557 | −0.0513 | 0.6952 | 0.1474 | 0.1618 |
| | 1.38 | −0.34 | 2.68 *** | 0.36 | 0.21 |
| NO2 | 1.0488 | 0.0779 | 0.5710 | 0.1766 | 0.1172 |
| | 1.53 | 0.53 | 2.24 ** | 0.44 | 0.16 |
| NOx | −0.9170 | −0.0024 | −0.5645 | −0.1312 | −0.1266 |
| | −1.33 | −0.02 | −2.21 ** | −0.32 | −0.17 |
| O3 | −0.0025 | −0.0064 | −0.0153 | 0.0064 | 0.0060 |
| | −0.28 | −0.64 | −1.26 | 0.75 | 0.58 |
| PM10 | 0.1634 | 0.1635 | 0.2327 | 0.1755 | 0.2001 |
| | 18.24 *** | 26.84 *** | 31.04 *** | 29.07 *** | 26.42 *** |
| SO2 | 0.2115 | 0.5537 | 0.6148 | 0.2355 | 0.2485 |
| | 5.64 *** | 16.23 *** | 9.04 *** | 5.42 *** | 3.71 *** |
| WindDirection | −0.0012 | −0.0006 | −0.0001 | −0.0002 | −0.0024 |
| | −1.28 | −0.95 | −0.10 | −0.28 | −3.40 *** |
| WindSpeed | −0.3804 | −0.2110 | 0.0008 | −0.2507 | −0.4646 |
| | −2.73 *** | −3.40 *** | 0.07 | −2.02 ** | −5.44 *** |
| C | 1.1559 | −0.6811 | −0.4225 | −0.0385 | 1.8648 |
| | 2.76 *** | −1.66 * | −1.69 * | −0.10 | 4.10 *** |
| Variance Equation | | | | | |
| C | 2.8001 | 0.5845 | 7.1652 | 1.3838 | 0.9128 |
| | 7.03 *** | 6.24 *** | 9.53 *** | 5.47 *** | 4.01 *** |
| ε²(t−1) | 0.3133 | 0.1689 | 0.4319 | 0.2426 | 0.1741 |
| | 8.81 *** | 8.53 *** | 10.44 *** | 8.46 *** | 10.03 *** |
| h(t−1) | 0.5639 | 0.8122 | 0.2937 | 0.6820 | 0.7891 |
| | 14.08 *** | 44.93 *** | 5.72 *** | 20.11 *** | 35.78 *** |

Note: Stations 1, 2, 3, 4 and 5 represent monitoring stations in the counties of FongYen, SaLu, DaLi, ChungMin and SeaTun, respectively. Variables examined include the fine-particle observation in the previous period (PM2.5 (t−1)), carbon monoxide (CO), nitric oxide (NO), nitrogen dioxide (NO2), nitrogen oxides (NOx), ozone (O3), suspended particulate matter (PM10), sulfur dioxide (SO2), wind direction (WindDirection) and wind speed (WindSpeed). t-statistic values are presented below each coefficient estimate. ***, ** and * indicate statistical significance at 1%, 5% and 10%, respectively.
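The variance equation in Table 5 has the usual GARCH (1,1) form h(t) = C + α·ε²(t−1) + β·h(t−1), where α and β are the coefficients on ε²(t−1) and h(t−1). The following sketch iterates that recursion using the Station 1 estimates from the table. The residual values, the function names and the choice to start the recursion at the unconditional variance are assumptions for illustration, not details taken from the study.

```python
def unconditional_variance(omega, alpha, beta):
    """Long-run variance omega / (1 - alpha - beta); requires alpha + beta < 1."""
    persistence = alpha + beta
    if persistence >= 1.0:
        raise ValueError("alpha + beta must be below 1 for a finite long-run variance")
    return omega / (1.0 - persistence)


def garch11_variance(resid, omega, alpha, beta, h0=None):
    """Conditional variances h[0..n-1] for residuals resid[0..n-1].

    h[0] is set to h0 (assumed: the unconditional variance when h0 is None),
    and h[t] = omega + alpha * resid[t-1]**2 + beta * h[t-1] for t >= 1.
    """
    h = [unconditional_variance(omega, alpha, beta) if h0 is None else h0]
    for t in range(1, len(resid)):
        h.append(omega + alpha * resid[t - 1] ** 2 + beta * h[t - 1])
    return h


# Station 1 variance-equation estimates from Table 5.
OMEGA, ALPHA, BETA = 2.8001, 0.3133, 0.5639

# Hypothetical mean-equation residuals (PM2.5 forecast errors), for illustration only.
residuals = [1.5, -8.0, 2.0, 0.5, -0.3]
print("persistence:", ALPHA + BETA)
print("long-run variance:", unconditional_variance(OMEGA, ALPHA, BETA))
print("conditional variances:", garch11_variance(residuals, OMEGA, ALPHA, BETA))
```

With these estimates the sum α + β is 0.8772, below one, so a shock to the conditional variance decays over time rather than persisting indefinitely; the large second residual in the example raises the next period's variance before it falls back toward the long-run level.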
Table 6. Performance of Alternative Prediction Models.

| Model | Station 1 | Station 2 | Station 3 | Station 4 | Station 5 |
|---|---|---|---|---|---|
| SVM | | | | | |
| MAPE | 10.33% | 35.94% | 21.64% | 26.89% | 16.08% |
| RMSE | 3.9405 | 4.7224 | 2.9855 | 7.2913 | 4.2309 |
| GA-SVM | | | | | |
| MAPE | 10.42% | 33.14% | 25.72% | 26.64% | 15.88% |
| RMSE | 4.0778 | 4.4938 | 3.1561 | 7.3327 | 4.2144 |
| GA-SVM-GARCH | | | | | |
| MAPE | 0.32% | 0.68% | 0.61% | 0.14% | 0.14% |
| RMSE | 0.0596 | 0.0950 | 0.0632 | 0.0313 | 0.0255 |

Note: Stations 1, 2, 3, 4 and 5 represent monitoring stations in the counties of FongYen, SaLu, DaLi, ChungMin and SeaTun, respectively.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Yao, K.-C.; Hsueh, H.-W.; Huang, M.-H.; Wu, T.-C. The Role of GARCH Effect on the Prediction of Air Pollution. Sustainability 2022, 14, 4459. https://doi.org/10.3390/su14084459
