Dimension Reduction of Machine Learning-Based Forecasting Models Employing Principal Component Analysis
Abstract
1. Introduction
2. Materials and Methods
2.1. Data
2.2. ANN
2.3. ANFIS
2.4. Wavelet Transform
2.5. PCA
3. Modeling Procedures and Error Measures
4. Results and Discussion
4.1. ANN and ANFIS Models
4.2. Wavelet-ANN and -ANFIS Models
4.3. PCA-Wavelet-ANN and -ANFIS Models
5. Conclusions
- The models fed by the PCs outperform the other models, demonstrating that the PCA approach captures the relevant information from the time series while reducing the dimension of the input variables.
- The wavelet-ANN model with the first four PCs performs better than the model with three PCs, whereas for the ANFIS model the opposite holds. Therefore, more PCs for the ANN models and fewer PCs for the ANFIS models lead to the desired outputs.
- Using factor analysis improved the performance of the existing wavelet-ANN and ANFIS models while decreasing computational time and complexity. Therefore, the proposed approach can be employed for forecasting other time series as well.
- Among the models examined in this study, the PC1-3-WANFIS model, that is, a wavelet-ANFIS model using three principal components from the decomposed time series, has the best performance. The proposed models run fast and forecast accurately over a wide range of DO variation. Moreover, the model performs very well in forecasting extreme values, which are of great importance for environmental management and planning.
- The results of this study show that factor analysis is a suitable proxy for dimensionality reduction of the forecasting models, improving performance in terms of computational time and reliability of the outputs. PCA is well suited to detecting inter-correlation among time series, which may cause the model to misbehave if it is not handled accordingly.
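The dimension-reduction step referred to in these conclusions can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `pca_reduce`, the synthetic inputs, and the use of NumPy are assumptions made for illustration only. Eight strongly inter-correlated columns stand in for wavelet-decomposed sub-series, and they are projected onto their first three principal components.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project the columns of X onto the first n_components principal components.

    Hypothetical helper for illustration; returns the PC scores and the
    fraction of variance explained by every component.
    """
    X = np.asarray(X, dtype=float)
    # Standardize each column so inputs with different units are comparable.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Covariance of standardized data is the correlation matrix (symmetric).
    corr = np.cov(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    # eigh returns eigenvalues in ascending order; sort them descending.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()
    scores = Z @ eigvecs[:, :n_components]
    return scores, explained

# Synthetic data (an assumption, not the study's data): one shared signal plus noise.
rng = np.random.default_rng(0)
t = np.arange(500)
base = np.sin(2 * np.pi * t / 50)
X = np.column_stack([base + 0.1 * rng.standard_normal(t.size) for _ in range(8)])

scores, explained = pca_reduce(X, n_components=3)
print(scores.shape)
print(np.round(np.cumsum(explained)[:4], 3))
```

The resulting score columns are mutually uncorrelated, so a downstream model sees three inputs instead of eight without the inter-correlation of the original columns. How many components to keep is a modeling choice, which in the study was settled by comparing three and four PCs.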
Author Contributions
Acknowledgments
Conflicts of Interest
Nomenclature
PCA | Principal Component Analysis |
ANFIS | Adaptive Neuro-Fuzzy Inference System |
ANN | Artificial Neural Network |
FT | Fourier Transform |
CWT | Continuous Wavelet Transform |
DWT | Discrete Wavelet Transform |
RMSE | Root Mean Square Error |
CV | Coefficient of Variation |
DO | Dissolved Oxygen |
BOD | Biochemical Oxygen Demand |
Chl | Chlorophyll |
SC | Specific Conductivity |
Tur | Turbidity |
Variable | Min | Max | Mean | Skew | CV (%) | CC |
---|---|---|---|---|---|---|
Chl (μg/L) | 0.52 | 10.37 | 1.81 | 2.36 | 73 | 0.75 |
T (°C) | 4.45 | 24.87 | 13 | 0.45 | 47 | 0.82 |
SC (μS/cm) | 53.17 | 106.17 | 80.94 | −0.29 | 11 | 0.97 |
Turbidity (FNU) | 1.00 | 60.57 | 6.50 | 3.10 | 122 | 0.70 |
DO (mg/L) | 6.91 | 14.30 | 11.00 | −0.23 | 17 | 1 |
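As a rough illustration of two of the summary statistics in the table above, the following sketch computes the coefficient of variation (CV, in percent) and the skewness of a sample. The function names, the sample values, and the specific estimators (sample standard deviation for CV, moment-based skewness) are assumptions; the source does not state which estimators were used.

```python
import math

def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return 100.0 * std / mean

def skewness(values):
    """Moment-based skewness: third central moment over variance^(3/2)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Illustrative values only; these are not measurements from the study.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(round(coefficient_of_variation(sample), 2))
print(skewness(sample))
```

A positive skewness, as reported for Chl and turbidity, indicates a long right tail, that is, occasional values far above the mean.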

Model | Training R² | Training RMSE (mg/L) | Testing R² | Testing RMSE (mg/L) | Run Time (s) |
---|---|---|---|---|---|
ANN | 0.97 | 0.45 | 0.97 | 0.45 | 8.6 |
ANFIS | 0.99 | 0.19 | 0.92 | 0.56 | 20.6 |

Model | Training R² | Training RMSE (mg/L) | Testing R² | Testing RMSE (mg/L) | Run Time (s) |
---|---|---|---|---|---|
WANN | 0.97 | 0.43 | 0.97 | 0.52 | 9.0 |
WANFIS | NAN | NAN | NAN | NAN | NAN |

Model | Training R² | Training RMSE (mg/L) | Testing R² | Testing RMSE (mg/L) | Run Time (s) |
---|---|---|---|---|---|
PC1-3-WANN | 0.96 | 0.40 | 0.92 | 0.58 | 9.3 |
PC1-4-WANN | 0.98 | 0.25 | 0.97 | 0.36 | 9.6 |
PC1-3-WANFIS | 0.98 | 0.25 | 0.97 | 0.36 | 6.5 |
PC1-4-WANFIS | 0.99 | 0.18 | 0.88 | 1.017 | 8.8 |
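The two error measures reported in the tables above can be computed as in the following minimal sketch. It assumes that R² denotes the coefficient of determination; the source may instead use the squared correlation coefficient, which can give a different value. The function names and the sample values are illustrative and are not taken from the study.

```python
import math

def rmse(observed, predicted):
    """Root mean square error, in the same units as the data (mg/L for DO)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Illustrative values only; not data from the study.
observed = [10.0, 11.0, 12.0, 13.0]
predicted = [10.5, 10.5, 12.5, 12.5]
print(rmse(observed, predicted))       # 0.5
print(r_squared(observed, predicted))  # 0.8
```

Comparing the training and testing columns with these measures is what reveals overfitting: a model whose training RMSE is low but whose testing RMSE is much higher, as for PC1-4-WANFIS, generalizes poorly.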
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Meng, Y.; Qasem, S.N.; Shokri, M.; S, S. Dimension Reduction of Machine Learning-Based Forecasting Models Employing Principal Component Analysis. Mathematics 2020, 8, 1233. https://doi.org/10.3390/math8081233