# Predicting Daily Air Pollution Index Based on Fuzzy Time Series Markov Chain Model


## Abstract


## 1. Introduction

## 2. Preliminary

#### Fuzzy Time-Series Definitions

**Definition 1.**

**Definition 2.**

**Definition 3.**

## 3. Methodology

#### 3.1. Study Area and Dataset

… ozone (O$_{3}$), sulphur dioxide (SO$_{2}$), particulate matter (PM$_{10}$), carbon monoxide (CO), and nitrogen dioxide (NO$_{2}$), as shown in Figure 1 [42,43,44,45]. Sub-indices are determined from the averages for these five pollutants, and the maximum of these sub-indices is selected as the API value [3]. In Malaysia, the air pollution index (API) has been adopted as the measure of air pollution conditions. The API is a single number, ranging from 0 with no fixed upper bound, that reflects air quality levels in terms of their health effects [3,45].

…$^{2}$, as shown in Figure 2. The API dataset is divided into a training dataset, from 1 January 2012 to 31 December 2013, and a testing dataset, from 1 January 2014 to 31 December 2014. The API values recorded at the selected monitoring station were provided by the Department of Environment of Malaysia. The total number of observations in this study is 1096. An API value below 100 denotes good air quality, while a value above 100 indicates a higher degree of air pollution. The states are classified using API breakpoints of 50, 100, 200, and 300, which separate the good, moderate, unhealthy, very unhealthy, and hazardous states, respectively, as shown in Table 1 [3,45].
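The state classification described above can be illustrated with a short lookup. This is a minimal sketch, not part of the original study: the function name `api_state` is hypothetical, and the intervals are assumed to be closed on the right, as in the state ranges of Table 1 ([0, 50], (50, 100], and so on).

```python
from bisect import bisect_left

# Upper bounds of the first four states; anything above 300 is "Hazardous".
BREAKPOINTS = [50, 100, 200, 300]
STATES = ["Good", "Moderate", "Unhealthy", "Very Unhealthy", "Hazardous"]


def api_state(api):
    """Return the air quality status for a non-negative API value."""
    if api < 0:
        raise ValueError("API values are non-negative")
    # bisect_left keeps each breakpoint in the lower state, e.g. 50 -> "Good".
    return STATES[bisect_left(BREAKPOINTS, api)]


if __name__ == "__main__":
    for value in (50, 51, 100, 250, 495):
        print(value, api_state(value))
```

Using `bisect_left` rather than a chain of comparisons keeps the breakpoints in one place, so the boundary convention is easy to check against the table.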

#### 3.2. Proposed Model

**Step 1.** Define the universe of discourse $U$ from the available time-series data using $U=\left[D_{min}-D_{1},\ D_{max}+D_{2}\right]$, where $D_{min}$ and $D_{max}$ denote the minimum and maximum values of the observed data, and $D_{1}$ and $D_{2}$ are positive values.
**Step 2.** Partition $U$ using the grid partition method [21,29] with different numbers of partitions, namely 5, 6, 7, 8, …, 50. To avoid redundancy, only the results for 5, 10, 15, 20, 25, 30, 35, 40, and 45 partitions are presented when determining the optimal number of partitions of the universe, that is, the number that gives the best model accuracy.
**Step 3.** Define the fuzzy sets $A_{i}$ on $U$ using the following equation, where $f_{A_{i}}(u_{k})$ is the membership grade of interval $u_{k}$ in $A_{i}$: $$A_{i}=\frac{f_{A_{i}}\left(u_{1}\right)}{u_{1}}+\frac{f_{A_{i}}\left(u_{2}\right)}{u_{2}}+\dots+\frac{f_{A_{i}}\left(u_{n}\right)}{u_{n}}$$
**Step 4.** Fuzzify the observations into fuzzy numbers based on the maximum membership value.
**Step 5.** Construct the fuzzy logical relationships (FLRs) and group them into fuzzy logical relationship groups (FLRGs) to build the frequency (count) matrix of the fuzzy relations between observations.
**Step 6.** Generate the Markov weights (transition probability matrix) from the frequencies of the FLRGs established in Step 5. The number of states $n$ equals the number of fuzzy sets, so the matrix $P$ has dimension $n\times n$. The state transition probability $P_{ij}$ from state $A_{i}$ to state $A_{j}$ is the probability of observing state $j$ at time $t+1$ given state $i$ at time $t$, i.e., $P_{ij}=\Pr\left(y_{t+1}=j \mid y_{t}=i\right)$. It is calculated as $$P_{ij}=\frac{N_{ij}}{N_{i\cdot}},\quad i,j=1,2,\dots,n$$ where $N_{ij}$ is the number of transitions from $A_{i}$ to $A_{j}$ and $N_{i\cdot}$ is the total number of transitions out of $A_{i}$. The transition probability matrix is $$P=\left[\begin{array}{cccc}P_{11} & P_{12} & \cdots & P_{1n}\\ P_{21} & P_{22} & \cdots & P_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ P_{n1} & P_{n2} & \cdots & P_{nn}\end{array}\right]$$
**Step 7.** Calculate the forecasted values. The following rules are considered in calculating the forecasts.

**Step 8.** Adjust the forecasted values by adding the differences of the actual values $Y\left(t\right)$, which reduces the estimation error. The adjusted forecasted values are given by $$\widehat{F}\left(t+1\right)=F\left(t+1\right)+\mathrm{diff}\left(Y\left(t\right)\right)$$
**Step 9.** Validate the model.
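Steps 1 to 4 above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the membership grades (1 for the matching interval, 0.5 for its neighbours, 0 elsewhere) are assumed from the pattern of the fuzzy sets tabulated in this paper, and the example reuses $D_{min}=25$, $D_{max}=495$, and $D_{1}=D_{2}=5$ from Section 4.

```python
def universe(values, d1, d2):
    """Step 1: U = [D_min - D1, D_max + D2]."""
    return min(values) - d1, max(values) + d2


def grid_partition(lower, upper, n):
    """Step 2: split U into n equal-width intervals (0-based list)."""
    width = (upper - lower) / n
    return [(lower + k * width, lower + (k + 1) * width) for k in range(n)]


def membership(i, k):
    """Step 3: assumed grade of interval k in fuzzy set i (1, 0.5, or 0)."""
    distance = abs(i - k)
    if distance == 0:
        return 1.0
    return 0.5 if distance == 1 else 0.0


def fuzzify(value, intervals):
    """Step 4: index of the fuzzy set with maximum membership for value.

    With the grades above, this is the interval that contains the value.
    """
    lower, upper = intervals[0][0], intervals[-1][1]
    if not lower <= value <= upper:
        raise ValueError("value lies outside the universe of discourse")
    width = intervals[0][1] - intervals[0][0]
    k = int((value - lower) // width)
    return min(k, len(intervals) - 1)  # the upper bound belongs to the last set


if __name__ == "__main__":
    lo, hi = universe([25, 495], d1=5, d2=5)
    intervals = grid_partition(lo, hi, 30)
    print(lo, hi, intervals[0], intervals[-1])
    print([membership(0, k) for k in range(4)])
```

The indices here are 0-based positions in the interval list and are not meant to reproduce the labels assigned in the paper's fuzzification table, which may come from a different partitioner configuration.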

#### 3.3. Model Validation

## 4. The Implementation of the Algorithm

**Step 1.** Define $U$ for the API values using $U=\left[D_{min}-D_{1},\ D_{max}+D_{2}\right]$: $$U=\left[25-5,\ 495+5\right]=\left[20,\ 500\right]$$
**Step 2.** Partition $U$ using different numbers of partitions from 1 to 50. For brevity, only the partitions for 5, 10, 15, …, 30 are presented, as shown in Figure 5.
**Step 3.** Define the fuzzy sets. The fuzzy sets $A_{k}$ are determined from the intervals $u_{k}$ formed by the grid method in the previous step, together with the membership function, and are written using Equation (2). Table 1 reveals the fuzzy sets $A_{i}$ ($i=1, 2, \dots, n$). A larger index $i$ corresponds to a fuzzy set covering higher API values.
**Step 4.** Transform the API values into fuzzy numbers and find the fuzzy logical relationships (FLRs), as shown in Table 2.

**Step 5.** Determine the fuzzy logical relationships (FLRs) and the frequency (count) matrix of the fuzzy relations between observations. The FLRs are then grouped into fuzzy logical relationship groups (FLRGs).
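As an illustration of Step 5, the sketch below counts the FLRs for the first seven training days, whose fuzzy numbers are listed in the fuzzification table as A0, A2, A1, A1, A1, A1, A3. The helper name `build_flrgs` is hypothetical, and the code is a minimal sketch rather than the authors' implementation.

```python
from collections import Counter, defaultdict


def build_flrgs(states):
    """Group consecutive state pairs (FLRs) by their left-hand side.

    Returns {current_state: Counter({next_state: count})}.
    """
    groups = defaultdict(Counter)
    for current, following in zip(states, states[1:]):
        groups[current][following] += 1
    return groups


if __name__ == "__main__":
    first_week = ["A0", "A2", "A1", "A1", "A1", "A1", "A3"]
    for lhs, rhs in build_flrgs(first_week).items():
        print(lhs, "->", dict(rhs))
```

Applied to the full training sequence, the same counting yields the frequencies shown in parentheses in the FLRG table.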

**Step 6.** Assign the Markov weights from the frequency matrix of Step 5 using Equation (4), as shown in Table 4. A transition process diagram can then be drawn from the weights to visualize the Markov weight matrix.
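The Markov weights are the row-normalized frequencies. The sketch below applies this to the counts listed for groups G1 and G2 in the FLRG table; the function name `markov_weights` is hypothetical, and exact fractions are used only to make the comparison with the tabulated weights easy.

```python
from fractions import Fraction


def markov_weights(counts):
    """Convert transition counts out of one state into probabilities."""
    total = sum(counts.values())
    return {state: Fraction(count, total) for state, count in counts.items()}


if __name__ == "__main__":
    g1 = {"A0": 4, "A1": 4, "A3": 1}                        # transitions out of A0
    g2 = {"A0": 3, "A1": 125, "A2": 65, "A3": 10, "A4": 1}  # transitions out of A1
    print(markov_weights(g1))
    print(markov_weights(g2))
```

For G1 this gives 4/9, 4/9, and 1/9, and for G2 it gives 1/68, 125/204, 65/204, 5/102, and 1/204, in agreement with the first two rows of the Markov weights table.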

**Step 7.** Calculate the forecasted values using Equation (5) or (6) based on the Markov weights. For example, the forecast value for the day 2012/1/2 is calculated using Equation (6).
**Step 8.** Adjust the forecasted values using Equation (7). For example, the forecast value found in Step 7 is 56.66.
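Equations (5) and (6) are not reproduced in this text, so the following is only a hedged sketch of how Steps 7 and 8 might be coded. It assumes a forecasting rule commonly used in fuzzy time series Markov chain models: a one-to-one relationship returns the midpoint of the target interval, and a one-to-many relationship returns the probability-weighted sum of midpoints, with the actual value $Y(t)$ used in place of the midpoint for the self-transition. It further assumes that $\mathrm{diff}(Y(t))=Y(t)-Y(t-1)$. Both assumptions may differ from the authors' exact equations, and all names and numbers below are hypothetical toy values, not data from the study.

```python
def forecast_next(current, y_t, row, midpoints):
    """Forecast F(t+1) from state `current` under the assumed rule.

    row: transition probabilities {next_state: P} out of `current`.
    midpoints: interval midpoints {state: m}.
    """
    if not row:                       # no known transition: use own midpoint
        return midpoints[current]
    if len(row) == 1:                 # one-to-one relationship
        (only_state,) = row
        return midpoints[only_state]
    # One-to-many: weight the midpoints, using Y(t) for the self-transition.
    return sum(p * (y_t if state == current else midpoints[state])
               for state, p in row.items())


def adjust(forecast, y_t, y_prev):
    """Adjusted forecast F(t+1) + diff(Y(t)), assuming diff = Y(t) - Y(t-1)."""
    return forecast + (y_t - y_prev)


if __name__ == "__main__":
    midpoints = {0: 10.0, 1: 20.0, 2: 30.0}   # toy interval midpoints
    row = {0: 0.5, 1: 0.25, 2: 0.25}          # toy Markov weights out of state 0
    f = forecast_next(0, 12.0, row, midpoints)
    print(f, adjust(f, y_t=12.0, y_prev=11.0))
```

Because the toy inputs are not taken from the API data, the output is not expected to match the value 56.66 quoted in Step 8.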

## 5. Model Evaluation

#### 5.1. Fitting the Optimal Partition Number of the Universe of Discourse

#### 5.2. Model’s Validation

## 6. Conclusions

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## Appendix A

Symbol/Abbreviation | Description |
---|---|
$A_{i}$ | Fuzzy set |
$U$ | Universe of discourse |
$D_{min}$ | The minimum value of the observed data |
$D_{max}$ | The maximum value of the observed data |
$D_{1}$ | Positive value |
$D_{2}$ | Positive value |
$f_{A_{i}}$ | Membership function of fuzzy set $A_{i}$ |
$u_{i}$ | Linguistic intervals |
$F\left(t\right)$ | Fuzzy time series at time $t$ |
FLR | Fuzzy logical relationship |
FLRGs | Fuzzy logical relationship groups |
$m_{k}$ | Midpoint of the linguistic interval $u_{k}$ |
$P_{ij}$ | Transition probability |
$N_{ij}$ | Number of transitions |
$N_{i\cdot}$ | Total number of transitions |
$P$ | Transition probability matrix |
$Y\left(t\right)$ | Actual value |
$\mathrm{diff}\left(Y\left(t\right)\right)$ | The difference in actual values |
FTS | Fuzzy time series |
FTSMC | Fuzzy time series Markov chain |
ARIMA | Autoregressive integrated moving average |
ANN | Artificial neural network |
SO$_{2}$ | Sulphur dioxide |
O$_{3}$ | Ozone |
PM$_{10}$ | Particulate matter |
CO | Carbon monoxide |
NO$_{2}$ | Nitrogen dioxide |
API | Air pollution index |
Theil's U | Theil's U statistic |
RMSE | Root mean square error |
MAPE | Mean absolute percentage error |
FTS | Fuzzy time series model proposed by Song |
CFTS | Fuzzy time series model proposed by Chen |
HOFTS | High order fuzzy time series model proposed by Severiano et al. |
TWFTS | Trend weighted fuzzy time series model proposed by Cheng |
$AIC$ | Akaike information criterion |
$BIC$ | Bayesian information criterion |
SARIMA | Seasonal autoregressive integrated moving average |
ARMA | Autoregressive moving average |
GARCH | Generalized autoregressive conditional heteroskedasticity |
ARCH | Autoregressive conditional heteroskedasticity |
$MSE$ | Mean square error |
$L$ | The maximum value of the likelihood function |
ACF | Autocorrelation function |
PACF | Partial autocorrelation function |

**Table A2.** Statistical criteria for fitting the best partition number of the FTSMC model using the training dataset.

Partitions | RMSE | MAPE | Theil's U |
---|---|---|---|
5 | 31.41 | 40.69 | 1.63 |
6 | 26.38 | 32.10 | 1.37 |
7 | 23.55 | 27.38 | 1.22 |
8 | 17.38 | 20.41 | 0.90 |
9 | 19.97 | 20.20 | 1.04 |
10 | 17.08 | 20.80 | 0.89 |
11 | 16.10 | 19.81 | 0.84 |
12 | 18.93 | 19.37 | 0.98 |
13 | 13.85 | 17.44 | 0.72 |
14 | 17.35 | 16.77 | 0.90 |
15 | 13.25 | 15.80 | 0.69 |
16 | 14.54 | 15.71 | 0.76 |
17 | 14.43 | 14.97 | 0.75 |
18 | 14.24 | 14.54 | 0.74 |
19 | 13.95 | 14.42 | 0.72 |
20 | 13.83 | 14.19 | 0.72 |
21 | 12.79 | 14.13 | 0.66 |
22 | 12.68 | 14.04 | 0.66 |
23 | 12.55 | 14.02 | 0.65 |
24 | 12.26 | 13.91 | 0.64 |
25 | 12.41 | 14.32 | 0.64 |
26 | 12.65 | 14.46 | 0.66 |
27 | 14.03 | 14.43 | 0.73 |
28 | 13.63 | 14.13 | 0.71 |
29 | 12.13 | 13.69 | 0.63 |
30 | 11.44 | 13.15 | 0.59 |
31 | 12.25 | 13.40 | 0.64 |
32 | 11.91 | 13.06 | 0.62 |
33 | 11.98 | 13.10 | 0.62 |
34 | 11.77 | 13.20 | 0.61 |
35 | 12.30 | 13.22 | 0.64 |
36 | 11.91 | 13.33 | 0.62 |
37 | 11.99 | 13.50 | 0.62 |
38 | 11.68 | 13.26 | 0.61 |
39 | 11.63 | 13.21 | 0.60 |
40 | 11.89 | 13.21 | 0.62 |
41 | 11.80 | 13.35 | 0.61 |
42 | 11.70 | 13.37 | 0.61 |
43 | 11.50 | 13.37 | 0.60 |
44 | 11.50 | 13.20 | 0.60 |
45 | 11.80 | 13.21 | 0.61 |
46 | 12.31 | 13.02 | 0.61 |
47 | 11.51 | 13.21 | 0.60 |
48 | 11.56 | 13.14 | 0.60 |
49 | 11.62 | 12.99 | 0.60 |

## References

1. Wang, L.; Wang, J.; Tan, X.; Fang, C. Analysis of NOx Pollution Characteristics in the Atmospheric Environment in Changchun City. Atmosphere **2020**, 11, 30.
2. Kumar, A.; Goyal, P. Forecasting of Daily Air Quality Index in Delhi. Sci. Total Environ. **2001**, 409, 5517–5523.
3. Alyousifi, Y.; Masseran, N.; Ibrahim, K. Modeling the stochastic dependence of air pollution index data. Stoch. Environ. Res. Risk Assess. **2018**, 32, 1603–1611.
4. Rahman, N.H.A.; Lee, M.H.; Suhartono, M.T.L. Evaluation performance of time series approach for forecasting air pollution index in Johor, Malaysia. Sains Malays. **2016**, 45, 1625–1633.
5. Box, G.E.P.; Jenkins, G.M. Time Series Analysis: Forecasting and Control, 1st ed.; Holden-Day: San Francisco, CA, USA, 1976.
6. David, G.S.; Rizol, P.M.S.R.; Nascimento, L.F.C. Fuzzy computational models to evaluate the effects of air pollution on children. Rev. Paul. De Pediatr. **2018**, 36, 10–16.
7. Elangasinghe, M.A.; Singhal, N.; Dirks, K.N.; Salmond, J.A. Development of an ANN-based air pollution forecasting system with explicit knowledge through sensitivity analysis. Atmos. Pollut. Res. **2014**, 5, 696–708.
8. Rahman, N.H.A.; Lee, M.H.; Latif, M.T. Artificial neural networks and fuzzy time series forecasting: An application to air quality. Qual. Quant. **2015**, 49, 2633–2647.
9. Bernard, F. Fuzzy environmental Decision-making: Applications to Air Pollution. Atmos. Environ. **2003**, 37, 1865–1877.
10. Heo, J.-S.; Kim, D.-S. A New Method of Ozone Forecasting Using Fuzzy Expert and Neural Network Systems. Sci. Total Environ. **2004**, 325, 221–237.
11. Morabito, F.C.; Versaci, M. Fuzzy Neural Identification and Forecasting Techniques to Process Experimental Urban Air Pollution Data. Neural Netw. **2003**, 16, 493–506.
12. Dincer, N.G.; Akkuş, Ö. A new fuzzy time series model based on robust clustering for forecasting of air pollution. Ecol. Inform. **2018**, 43, 157–164.
13. Aripin, A.; Suryono, S.; Bayu, S. Web based prediction of pollutant PM10 concentration using Ruey Chyn Tsaur fuzzy time series model. In Proceedings of the 2016 Conference on Fundamental and Applied Science for Advanced Technology (Confast 2016), Yogyakarta, Indonesia, 25–26 January 2016; pp. 20–46.
14. Hong, W.A.; Man, J.I.; Yili, T.A. Air Quality Index Forecast Based on Fuzzy Time Series Models. J. Residuals Sci. Technol. **2016**, 13.
15. Mishra, D.; Goyal, P. Neuro-fuzzy approach to forecast NO$_{2}$ pollutants addressed to air quality dispersion model over Delhi, India. Aerosol Air Qual. Res. **2016**, 16, 166–174.
16. Darmawan, D.; Irawan, M.I.; Syafei, A.D. Data Driven Analysis using Fuzzy Time Series for Air Quality Management in Surabaya. Sustinere J. Environ. Sustain. **2017**, 1, 57–73.
17. Cheng, C.H.; Huang, S.F.; Teoh, H.J. Predicting daily ozone concentration maxima using fuzzy time series based on a two-stage linguistic partition method. Comput. Math. Appl. **2011**, 62, 2016–2028.
18. Song, Q.; Chissom, B.S. Forecasting enrollments with fuzzy time series-Part I. Fuzzy Sets Syst. **1993**, 54, 1–10.
19. Song, Q.; Chissom, B.S. Forecasting enrollments with fuzzy time series-Part II. Fuzzy Sets Syst. **1994**, 54, 1–10.
20. Zadeh, L.A. Fuzzy sets. Inf. Control. **1965**, 8, 338–353.
21. Chen, S.M. Forecasting enrollments based on fuzzy time series. Fuzzy Sets Syst. **1996**, 81, 311–319.
22. Huarng, K. Effective lengths of intervals to improve forecasting in fuzzy time series. Fuzzy Sets Syst. **2011**, 123, 387–394.
23. Huarng, K.; Yu, T.H.-K. Ratio-based lengths of intervals to improve fuzzy time series forecasting. IEEE Trans. Syst. Man Cybern. Part B Cybern. **2006**, 36, 328–340.
24. Yolcu, U. A new approach based on optimization of ratio for seasonal fuzzy time series. Iranian J. Fuzzy Syst. **2016**, 13, 19–36.
25. Yu, H.-K. Weighted fuzzy time series models for TAIEX forecasting. Physica A: Stat. Mech. Appl. **2005**, 349, 609–624.
26. Cheng, C.H.; Chen, T.L.; Teoh, H.J.; Chiang, C.H. Fuzzy time series based on adaptive expectation model for TAIEX forecasting. Expert Syst. Appl. **2008**, 34, 1126–1132.
27. Efendi, R.; Ismail, Z.; Deris, M.M. Improved weight Fuzzy Time Series as used in the exchange rates forecasting of US Dollar to Ringgit Malaysia. Int. J. Comput. Intell. Appl. **2013**, 12, 13–29.
28. Tsaur, R.C. A fuzzy time series-Markov chain model with an application to forecast the exchange rate between the Taiwan and US dollar. Int. J. Innov. Comput. Inf. Control. **2012**, 8, 1349–4198.
29. Sadaei, H.J.; Enayatifar, R.; Abdullah, A.H.; Gani, A. Short-term load forecasting using a hybrid model with a refined exponentially weighted fuzzy time series and an improved harmony search. Inte. J. Elec. P. & Ene. Syst. **2014**, 62, 118–129.
30. Egrioglu, E.; Aladag, C.H.; Başaran, M.A.; Uslu, V.R.; Yolcu, U. A new approach based on the optimization of the length of intervals in fuzzy time series. J. Intell. Fuzzy Syst. **2011**, 22, 15–19.
31. Chen, M.Y.; Chen, B.T. A hybrid fuzzy time series model based on granular computing for stock price forecasting. Info. Sci. **2018**, 294, 227–241.
32. Talarposhti, F.M.; Sadaei, H.J.; Enayatifar, R.; Guimarães, F.G.; Mahmud, M.; Eslami, T. Stock market forecasting by using a hybrid model of exponential fuzzy time series. Inter. J. Appro. Reas. **2019**, 70, 79–98.
33. Cheng, C.H.; Yang, J.H. Fuzzy time-series model based on rough set rule induction for forecasting stock price. Neurocomputing **2018**, 302, 33–45.
34. Rahim, N.F.; Othman, M.; Sokkalingam, R.; Abdul Kadir, E. Type 2 Fuzzy Inference-Based Time Series Model. Symmetry **2019**, 11, 1340.
35. Bose, M.; Mali, K. A novel data partitioning and rule selection technique for modeling high-order fuzzy time series. Applied Soft Computing **2018**, 63, 87–96.
36. Zuo, K.T.; Chen, L.P.; Zhang, Y.Q.; Yang, J. Manufacturing-and machining-based topology optimization. Inter. J. Adv. Manu. Tech. **2006**, 27, 531–536.
37. Ning, J.; Nguyen, V.; Huang, Y.; Hartwig, K.T.; Liang, S.Y. Inverse determination of Johnson–Cook model constants of ultra-fine-grained titanium based on chip formation model and iterative gradient search. Inter. J. Adv. Manu. Tech. **2018**, 99, 1131–1140.
38. Ning, J.; Liang, S.Y. Inverse identification of Johnson-Cook material constants based on modified chip formation model and iterative gradient search using temperature and force measurements. Inter. J. Adv. Manu. Tech. **2019**, 102, 2865–2876.
39. Koo, J.W.; Wong, S.W.; Selvachandran, G.; Long, H.V. Prediction of Air Pollution Index in Kuala Lumpur using fuzzy time series and statistical models. Air Quality, Atmosphere & Health **2020**, 75, 107–111.
40. Wang, J.; Li, H.; Lu, H. Application of a novel early warning system based on fuzzy time series in urban air quality forecasting in China. Applied Soft Computing **2018**, 71, 783–799.
41. Yang, H.; Zhu, Z.; Li, C.; Li, R. A novel combined forecasting system for air pollutants concentration based on fuzzy theory and optimization of aggregation weight. Applied Soft Computing **2019**, 87, 105972.
42. DOE Air Quality. Available online: https://www.doe.gov.my/portalv1/en/info-umum/kuality-udara/114 (accessed on 10 April 2019).
43. DOE Air Pollution Index of Malaysia. Available online: http://apims.doe.gov.my (accessed on 7 January 2020).
44. DOE Air Quality Standards. Available online: https://www.doe.gov.my/portalv1/en/info-umum/english-airquality-trend/108 (accessed on 31 January 2020).
45. Alyousifi, Y.; Ibrahim, K.; Kang, W.; Zin, W.Z.W. Markov chain modeling for air pollution index based on maximum a posteriori method. Air Quality, Atmosphere & Health **2019**, 1–11.
46. Silva, P.C.d.L.; Lucas, P.O.; Sadaei, H.J.; Guimarães, F.G. pyFTS: Fuzzy Time Series for Python. **2018**.
47. Cheng, C.H.; Chen, T.L.; Chiang, C.H. Trend-Weighted Fuzzy Time-Series Model for TAIEX Forecasting Neural Information Processing. In International Conference on Neural Information Processing; Springer: Berlin/Heidelberg, Germany, 2006; Volume 42, pp. 469–477.
48. Severiano, C.A.; Silva, P.C.; Sadaei, H.J.; Guimarães, F.G. Very short-term solar forecasting using fuzzy time series. In Proceedings of the 2017 IEEE international conference on fuzzy systems (FUZZ-IEEE), Naples, Italy, 9–12 July 2017; pp. 1–6.
49. Syafei, A.D. Applying exponential state space smoothing model to short term prediction of NO2. Jurnal Teknologi **2015**, 9–75.
50. Lee, M.H.; Rahman, N.; Suhartono, S.; Latif, M.T.; Nor, M.; Kamisan, N. Seasonal ARIMA for forecasting air pollution index: A case study. Am. J. Appl. Sci. **2012**, 9, 570–578.
51. Pahlavani, M.; Roshan, R. The comparison among ARIMA and hybrid ARIMA-GARCH models in forecasting the exchange rate of Iran. Inter. J. Busi. Dev. Stu. **2015**, 7, 31–50.
52. Tseng, F.M.; Tzeng, G.H.; Yu, H.C.; Yuan, B.J. Fuzzy ARIMA model for forecasting the foreign exchange market. Fuzzy Sets Syst. **2001**, 118, 9–19.
53. Akaike, H. A new look at the statistical model identification. Autom Control IEEE Trans. **1974**, 19, 716–723.
54. Konishi, S.; Kitagawa, G. Bayesian information criteria. In Information criteria and statistical modeling; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2008; pp. 211–237.

**Figure 6.** The FTSMC model using the grid method with different numbers of partitions using a training dataset.

**Figure 7.** The FTSMC model based on the grid method with the best partition number using a training dataset.

State | Range of APIs | Air Quality Status | Health Consequences |
---|---|---|---|
1 | [0, 50] | Good | Low pollution without any bad effect on health |
2 | (50, 100] | Moderate | Moderate pollution that does not pose any bad effect on health |
3 | (100, 200] | Unhealthy | Worsens the health condition of high-risk people that have heart and lung complications |
4 | (200, 300] | Very Unhealthy | Affects public health. Worsens the health condition and low tolerance of physical exercises for people with heart and lung complications |
5 | (300, $\infty$) | Hazardous | Hazardous to high-risk people and public health |

No | FTS Values $A_{i}$ |
---|---|
1 | $A_{0}=\frac{1}{u_{1}}+\frac{0.5}{u_{2}}+\frac{0}{u_{3}}+\frac{0}{u_{4}}+\dots+\frac{0}{u_{27}}+\frac{0}{u_{28}}+\frac{0}{u_{29}}$ |
2 | $A_{1}=\frac{0.5}{u_{1}}+\frac{1}{u_{2}}+\frac{0.5}{u_{3}}+\frac{0}{u_{4}}+\dots+\frac{0}{u_{27}}+\frac{0}{u_{28}}+\frac{0}{u_{29}}$ |
3 | $A_{2}=\frac{0}{u_{1}}+\frac{0.5}{u_{2}}+\frac{1}{u_{3}}+\frac{0.5}{u_{4}}+\dots+\frac{0}{u_{27}}+\frac{0}{u_{28}}+\frac{0}{u_{29}}$ |
⋮ | ⋮ |
29 | $A_{28}=\frac{0}{u_{1}}+\frac{0}{u_{2}}+\frac{0}{u_{3}}+\frac{0}{u_{4}}+\dots+\frac{0.5}{u_{26}}+\frac{1}{u_{27}}+\frac{0.5}{u_{28}}+\frac{0}{u_{29}}$ |
30 | $A_{29}=\frac{0}{u_{1}}+\frac{0}{u_{2}}+\frac{0}{u_{3}}+\frac{0}{u_{4}}+\dots+\frac{0}{u_{27}}+\frac{0.5}{u_{28}}+\frac{1}{u_{29}}$ |

N | Date | API | Fuzzy Number | Fuzzy Set Relationships |
---|---|---|---|---|
1 | 2012/1/1 | 51 | A0 | - |
2 | 2012/1/2 | 81 | A2 | A0 $\to$ A2 |
3 | 2012/1/3 | 65 | A1 | A2 $\to$ A1 |
4 | 2012/1/4 | 70 | A1 | A1 $\to$ A1 |
5 | 2012/1/5 | 66 | A1 | A1 $\to$ A1 |
6 | 2012/1/6 | 65 | A1 | A1 $\to$ A1 |
7 | 2012/1/7 | 98 | A3 | A1 $\to$ A3 |
⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
732 | 2013/12/29 | 50 | A0 | A0 $\to$ A1 |
733 | 2013/12/30 | 61 | A1 | A1 $\to$ A1 |
734 | 2013/12/31 | 70 | A1 | A1 $\to$ A1 |

Group | Fuzzy Logical Relationships (FLRs) |
---|---|
G1 | A0 $\to$ (4) A0, (4) A1, (1) A3 |
G2 | A1 $\to$ (3) A0, (125) A1, (65) A2, (10) A3, (1) A4 |
G3 | A2 $\to$ (2) A0, (70) A1, (248) A2, (36) A3, (3) A4, (4) A5 |
G4 | A3 $\to$ A1, A2, A3, A4, A5 |
G5 | A4 $\to$ (2) A1, (2) A2, (11) A3, (10) A4, (1) A6, (1) A7 |
G6 | A5 $\to$ (2) A2, (2) A3, (3) A4, A5, A8 |
G7 | A6 $\to$ (1) A4, (1) A5, (1) A6 |
G8 | A7 $\to$ (1) A12 |
G9 | A8 $\to$ (2) A6, (2) A8 |
G10 | A12 $\to$ (1) A14 |
G11 | A14 $\to$ (1) A3 |
G12 | A26 $\to$ (1) A27 |
G13 | A27 $\to$ (1) A14 |

Markov Weights Elements for Each Group |
---|
A0 $\to$ A0 (4/9), A1 (4/9), A3 (1/9) |
A1 $\to$ A0 (1/68), A1 (125/204), A2 (65/204), A3 (5/102), A4 (1/204) |
A2 $\to$ A0 (2/363), A1 (70/363), A2 (248/363), A3 (12/121), A4 (1/121), A5 (4/363) |
A3 $\to$ A1 (2/107), A2 (47/107), A3 (47/107), A4 (10/107), A5 (10/107) |
A4 $\to$ A1 (2/27), A2 (2/27), A3 (11/27), A4 (10/27), A6 (1/27), A7 (1/27) |
A5 $\to$ A2 (2/9), A3 (2/9), A4 (1/9), A5 (1/3), A8 (1/9) |
A6 $\to$ A4 (1/3), A5 (1/3), A6 (1/3) |
A7 $\to$ A12 (1) |
A8 $\to$ A6 (1/2), A8 (1/2) |
A12 $\to$ A14 (1) |
A14 $\to$ A3 (1) |
A26 $\to$ A27 (1) |
A27 $\to$ A14 (1) |

**Table 6.** Statistical criteria for fitting the best partition number of the FTSMC model using the training dataset.

Number of Partitions | RMSE | MAPE | Theil's U |
---|---|---|---|
5 | 31.41 | 40.69 | 1.63 |
10 | 17.08 | 20.80 | 0.89 |
15 | 13.25 | 15.80 | 0.69 |
20 | 13.83 | 14.19 | 0.72 |
25 | 12.41 | 14.32 | 0.64 |
30 | 11.44 | 13.15 | 0.59 |
35 | 12.30 | 13.22 | 0.64 |
40 | 11.89 | 13.21 | 0.62 |
45 | 11.80 | 13.21 | 0.61 |
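The three criteria reported in Table 6 can be computed as in the sketch below. RMSE and MAPE follow their standard definitions. The exact form of Theil's U used in the study is not given in this text, so the version here is an assumption: the ratio of the model's root squared error to that of a naive forecast that repeats the previous observation, under which values below 1 indicate an improvement over the naive forecast. The function names and the sample numbers are hypothetical.

```python
import math


def rmse(actual, forecast):
    """Root mean square error."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    n = len(actual)
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / n


def theils_u(actual, forecast):
    """Assumed Theil's U: model error relative to a naive persistence forecast.

    forecast[t] is the prediction of actual[t]; the naive forecast of
    actual[t] is actual[t - 1], so the comparison starts at t = 1.
    """
    model = sum((actual[t] - forecast[t]) ** 2 for t in range(1, len(actual)))
    naive = sum((actual[t] - actual[t - 1]) ** 2 for t in range(1, len(actual)))
    return math.sqrt(model / naive)


if __name__ == "__main__":
    actual = [100.0, 110.0, 120.0, 130.0]     # toy values, not API data
    forecast = [100.0, 108.0, 123.0, 128.0]
    print(rmse(actual, forecast), mape(actual, forecast), theils_u(actual, forecast))
```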

Model | RMSE | MAPE | Theil's U |
---|---|---|---|
FTS [18] | 27.94 | 19.45 | 1.45 |
CFTS [21] | 15.08 | 22.85 | 0.78 |
TWFTS [47] | 12.84 | 14.28 | 0.62 |
HOFTS [48] | 28.65 | 32.96 | 1.31 |
FTSMC (the proposed model) | 11.44 | 13.15 | 0.59 |

Model | RMSE | MAPE | Theil's U |
---|---|---|---|
FTS [18] | 46.80 | 61.24 | 2.07 |
CFTS [21] | 24.27 | 36.63 | 1.26 |
TWFTS [47] | 18.06 | 18.88 | 0.89 |
HOFTS [48] | 28.65 | 42.96 | 1.49 |
FTSMC (the proposed model) | 17.01 | 17.32 | 0.80 |

Prediction Model | AIC | BIC | Ranking |
---|---|---|---|
ARMA | 9389.56 | 9425.39 | 6 |
ARIMA | 9380.53 | 9415.82 | 4 |
Markov chain | 9381.23 | 9418.71 | 3 |
ARCH | 12,213.47 | 12,249.33 | 7 |
GARCH | 12,225.42 | 12,261.05 | 8 |
SARIMA | 9385.91 | 9421.28 | 5 |
Fuzzy-ARIMA | 9379.94 | 9313.98 | 2 |
Exponential smoothing | 12,942.58 | 12,977.24 | 9 |
FTSMC (the proposed model) | 9368.14 | 9406.46 | 1 |
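
For reference, the two information criteria in the table above are conventionally defined as follows, with $L$ the maximum value of the likelihood function as listed in Appendix A. The symbols $k$ (number of estimated parameters) and $n$ (number of observations) are introduced here for illustration and do not appear in the original notation.

$$AIC=2k-2\ln L,\qquad BIC=k\ln n-2\ln L$$

Under both criteria a smaller value indicates a better balance between goodness of fit and model complexity, which is the basis for a ranking in which the lowest AIC is placed first.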

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Alyousifi, Y.; Othman, M.; Sokkalingam, R.; Faye, I.; Silva, P.C.L.
Predicting Daily Air Pollution Index Based on Fuzzy Time Series Markov Chain Model. *Symmetry* **2020**, *12*, 293.
https://doi.org/10.3390/sym12020293
