GSTARI-X-ARCH Model with Data Mining Approach for Forecasting Climate in West Java
Abstract
1. Introduction and Background
2. Materials and Methods
2.1. Non-Stationary Univariate Time Series Analysis
2.2. Distance Inverse Weight Matrix
2.3. Spatiotemporal Model Based on Box–Jenkins Method
2.4. Space–Time Autoregressive Autocorrelation Function (STACF) and Space–Time Partial Autocorrelation Function (STPACF)
2.5. Autoregressive Conditional Heteroscedasticity (ARCH) Model
2.6. The Development of the GSTARI-X-ARCH Model
2.7. Diagnostic Checking
2.8. Mean Absolute Percentage Error (MAPE)
2.9. Knowledge Discovery in Database (KDD) Data Mining
3. Results
3.1. Pre-Processing Step
- The data come from NASA POWER and can be accessed via https://power.larc.nasa.gov/data-access-viewer/ (accessed on 25 May 2022). Information on the data storage size is available from https://disc.gsfc.nasa.gov/ (accessed on 25 May 2022): the storage size is 3,370,469 TB with 100 variables. The climate data comprise daily rainfall and humidity for the West Java region, which consists of 27 regencies/cities, from December 1989 to the end of 2021. The latitude and longitude coordinates of each location were obtained from https://www.latlong.net/ (accessed on 25 May 2022).
- Data cleaning is the preliminary stage of forecasting climate phenomena using the GSTARI-X-ARCH model with a data mining approach. The daily climate data for the 27 regencies/cities in West Java consist of 11,719 observations for each location. Rainfall is the response variable, and humidity is the exogenous variable. The daily data are then aggregated into monthly data using R software, which yields 385 monthly observations for each location. In this study, data selection keeps only the rainfall data that occur in the wet months, namely December, January, and February (DJF), which leaves 97 observations for each location. Figure 3 summarizes the pre-processing steps, in which locations sharing the same observation values are represented by a single selected location, giving 11 locations as input to the data mining process.
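The observation counts reported in the pre-processing step can be reproduced with a short sketch. The study performed this step in R; the Python version below is only an illustration. It assumes the daily record runs from 1 December 1989 to 31 December 2021, that monthly aggregation is a sum (the aggregation function is not stated here), and it uses a hypothetical `placeholder_rainfall` function in place of real NASA POWER values.

```python
from collections import defaultdict
from datetime import date, timedelta

# ASSUMPTION: exact start and end dates of the daily record.
START, END = date(1989, 12, 1), date(2021, 12, 31)
WET_MONTHS = {12, 1, 2}  # December, January, February (DJF)


def daily_dates(start, end):
    """Yield every calendar day from start to end, inclusive."""
    day = start
    while day <= end:
        yield day
        day += timedelta(days=1)


def placeholder_rainfall(day):
    """Hypothetical stand-in for a daily rainfall value; not real data."""
    return float(day.day % 7)


# Daily series for a single location.
daily = {day: placeholder_rainfall(day) for day in daily_dates(START, END)}

# Aggregate to monthly totals keyed by (year, month).
# ASSUMPTION: a sum is used; a monthly mean would give the same counts.
monthly = defaultdict(float)
for day, value in daily.items():
    monthly[(day.year, day.month)] += value

# Data selection: keep only the wet-season months.
djf = {key: total for key, total in monthly.items() if key[1] in WET_MONTHS}

print(len(daily), len(monthly), len(djf))  # 11719 385 97
```

The three printed counts agree with the 11,719 daily, 385 monthly, and 97 DJF observations per location described above, which supports the assumed date range.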
3.2. Data Mining Process
3.3. Post-Processing Step
4. Discussion
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A
Appendix B
- GSTARI-X(1,1,1)-ARCH(1) model to forecast climate phenomena in Bandung
- GSTARI-X(1,1,1)-ARCH(1) model to forecast climate phenomena in Pangandaran
- GSTARI-X(1,1,1)-ARCH(1) model to forecast climate phenomena in Bekasi
- GSTARI-X(1,1,1)-ARCH(1) model to forecast climate phenomena in Bogor
- GSTARI-X(1,1,1)-ARCH(1) model to forecast climate phenomena in Cirebon
- GSTARI-X(1,1,1)-ARCH(1) model to forecast climate phenomena in Tasikmalaya
- GSTARI-X(1,1,1)-ARCH(1) model to forecast climate phenomena in Majalengka
- GSTARI-X(1,1,1)-ARCH(1) model to forecast climate phenomena in Indramayu
- GSTARI-X(1,1,1)-ARCH(1) model to forecast climate phenomena in Purwakarta
- GSTARI-X(1,1,1)-ARCH(1) model to forecast climate phenomena in Kuningan
Location | Rainfall (Response Variable), Before Differencing: p-Value | Conclusion | Rainfall (Response Variable), After Differencing: p-Value | Conclusion | Humidity (Exogenous Variable): p-Value | Conclusion
---|---|---|---|---|---|---
Bandung | 0.1967 | NS | 0.01 | S | 0.01 | S
Pangandaran | 0.02321 | S | 0.01 | S | 0.01 | S
Bekasi | 0.05186 | NS | 0.01 | S | 0.01 | S
Bogor | 0.08977 | NS | 0.01 | S | 0.01 | S
Cirebon | 0.07034 | NS | 0.01 | S | 0.01 | S
Sukabumi | 0.1817 | NS | 0.01 | S | 0.01 | S
Tasikmalaya | 0.2616 | NS | 0.01 | S | 0.02534 | S
Majalengka | 0.2151 | NS | 0.01 | S | 0.01055 | S
Indramayu | 0.2129 | NS | 0.01 | S | 0.01 | S
Purwakarta | 0.1974 | NS | 0.01 | S | 0.01 | S
Kuningan | 0.08829 | NS | 0.01 | S | 0.01 | S
Location | Parameter | Estimator | Standard Error | t-Value | p-Value
---|---|---|---|---|---
Bandung | | −0.07314 | 0.26782 | −0.273 | 0.784839
 | | −0.58219 | 0.29588 | −1.968 | 0.049418
 | | −0.74419 | 0.28751 | −2.588 | 0.009798
Pangandaran | | −0.67030 | 0.14174 | −4.729 | 2.61 × 10⁻⁶
 | | 0.14381 | 0.18605 | 0.773 | 0.439728
 | | −0.67203 | 0.28974 | −2.319 | 0.020594
Bekasi | | −0.78844 | 0.15906 | −4.957 | 8.56 × 10⁻⁷
 | | 0.28311 | 0.26578 | 1.065 | 0.287061
 | | −0.78178 | 0.19671 | −3.974 | 7.63 × 10⁻⁵
Bogor | | −0.97557 | 0.25831 | −3.777 | 0.000169
 | | 0.48539 | 0.32335 | 1.501 | 0.133678
 | | −0.65104 | 0.21253 | −3.063 | 0.002255
Cirebon | | −0.93375 | 0.19567 | −4.772 | 2.13 × 10⁻⁶
 | | 0.32321 | 0.21727 | 1.488 | 0.137198
 | | −0.55124 | 0.18315 | −3.010 | 0.002688
Sukabumi | | −0.53085 | 0.21078 | −2.518 | 0.011958
 | | 0.01551 | 0.27051 | 0.057 | 0.954286
 | | −0.49953 | 0.27432 | −1.821 | 0.068939
Tasikmalaya | | −0.37805 | 0.23561 | −1.605 | 0.108942
 | | −0.18147 | 0.23449 | −0.774 | 0.439211
 | | −0.39744 | 0.28871 | −1.377 | 0.168979
Majalengka | | −0.31096 | 0.34039 | −0.914 | 0.361203
 | | −0.26507 | 0.29085 | −0.911 | 0.362337
 | | −0.65516 | 0.28429 | −2.305 | 0.021417
Indramayu | | −0.52592 | 0.40270 | −1.306 | 0.191898
 | | −0.17448 | 0.50440 | −0.346 | 0.729489
 | | −0.51769 | 0.19117 | −2.708 | 0.006896
Purwakarta | | −0.19875 | 0.29315 | −0.678 | 0.497955
 | | −0.51121 | 0.34563 | −1.479 | 0.139473
 | | −0.57548 | 0.19799 | −2.907 | 0.003743
Kuningan | | −0.76046 | 0.23199 | −3.278 | 0.001086
 | | 0.10644 | 0.21439 | 0.496 | 0.619678
 | | −0.52960 | 0.25072 | −2.112 | 0.034934
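As a reading aid for the table of parameter estimates, the sketch below recomputes the test statistic for the three Bandung rows as the estimator divided by its standard error, and flags which reported p-values fall below a 5% level. The 5% threshold and the names `bandung_rows` and `summarize` are assumptions made for illustration; the parameter symbols for the rows are not recoverable from the table, so the rows are identified by position only.

```python
# (estimator, standard error, reported p-value) for the three Bandung rows,
# copied from the table above in row order.
bandung_rows = [
    (-0.07314, 0.26782, 0.784839),
    (-0.58219, 0.29588, 0.049418),
    (-0.74419, 0.28751, 0.009798),
]

ALPHA = 0.05  # ASSUMPTION: significance level used for illustration


def summarize(rows, alpha=ALPHA):
    """Return (estimator / standard error, significant?) for each row."""
    return [(round(est / se, 3), p < alpha) for est, se, p in rows]


for statistic, significant in summarize(bandung_rows):
    print(statistic, "significant" if significant else "not significant")
```

The recomputed ratios match the statistic column of the table to three decimals, so under this reading the second and third Bandung coefficients are significant at the 5% level while the first is not.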
Lag | Q-Statistic | p-Value
---|---|---
1 | 133.1496 | 0.2121972
2 | 248.1893 | 0.3785795
3 | 369.4977 | 0.3957222
4 | 491.6240 | 0.3954386
5 | 588.6858 | 0.6751119
6 | 707.4616 | 0.6819559
7 | 810.7061 | 0.8101573
… | … | …
49 | 4617.5626 | 1.0000000
50 | 4640.3491 | 1.0000000
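The relationship between the Q-statistics and the probabilities in the last column of the table above can be checked with a small sketch. It assumes, as an inference from the reported numbers rather than something stated here, that each Q-statistic is compared with a chi-square distribution whose degrees of freedom equal the squared number of locations (11² = 121) times the lag. To stay within the standard library, the upper-tail probability is computed with the Wilson–Hilferty normal approximation instead of an exact chi-square routine; the function names are hypothetical.

```python
import math

N_LOCATIONS = 11  # number of locations in the table of parameter estimates


def chi2_sf_wilson_hilferty(q, df):
    """Approximate P(X > q) for X ~ chi-square(df) via Wilson-Hilferty."""
    c = 2.0 / (9.0 * df)
    z = ((q / df) ** (1.0 / 3.0) - (1.0 - c)) / math.sqrt(c)
    # Upper tail of the standard normal distribution at z.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def portmanteau_p_value(q, lag, k=N_LOCATIONS):
    """Approximate p-value of a portmanteau Q-statistic at a given lag.

    ASSUMPTION: degrees of freedom = k^2 * lag (inferred, not stated).
    """
    return chi2_sf_wilson_hilferty(q, k * k * lag)


# Q-statistics copied from the table for lags 1, 2, and 7.
for lag, q in [(1, 133.1496), (2, 248.1893), (7, 810.7061)]:
    print(lag, round(portmanteau_p_value(q, lag), 2))
```

Under these assumptions the approximation gives roughly 0.21, 0.38, and 0.81 for lags 1, 2, and 7, in line with the tabulated values. All of these exceed 0.05, so the test does not reject the hypothesis of uncorrelated residuals at those lags.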
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Monika, P.; Ruchjana, B.N.; Abdullah, A.S. GSTARI-X-ARCH Model with Data Mining Approach for Forecasting Climate in West Java. Computation 2022, 10, 204. https://doi.org/10.3390/computation10120204