A Study of Optimization in Deep Neural Networks for Regression
Abstract
1. Introduction
2. Sections of Optimization of Deep Neural Networks for Regression
2.1. Data Preprocessing
2.2. Network Architecture Selection
2.3. Selection of Optimizers
2.4. Hyperparameter Tuning
3. The Systemic and Global Optimization of DNNR Models
4. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Ghimire, S.; Deo, R.C.; Raj, N.; Mi, J. Deep Learning Neural Networks Trained with MODIS Satellite-Derived Predictors for Long-Term Global Solar Radiation Prediction. Energies 2019, 12, 2407.
- Nasser, A.A.; Rashad, M.Z.; Hussein, S.E. A Two-Layer Water Demand Prediction System in Urban Areas Based on Micro-Services and LSTM Neural Networks. IEEE Access 2020, 8, 147647–147661.
- Hoekendijk, J.P.A.; Kellenberger, B.; Aarts, G.; Brasseur, S.; Poiesz, S.S.H.; Tuia, D. Counting Using Deep Learning Regression Gives Value to Ecological Surveys. Sci. Rep. 2021, 11, 23209.
- Kamilaris, A.; Prenafeta-Boldú, F.X. Deep Learning in Agriculture: A Survey. Comput. Electron. Agric. 2018, 147, 70–90.
- Wang, H.; Lei, Z.; Zhang, X.; Zhou, B.; Peng, J. A Review of Deep Learning for Renewable Energy Forecasting. Energy Convers. Manag. 2019, 198, 111799.
- Xiong, Y.; Zhou, Y.; Wang, F.; Wang, S.; Wang, J.; Ji, J.; Wang, Z. Landslide Susceptibility Mapping Using Ant Colony Optimization Strategy and Deep Belief Network in Jiuzhaigou Region. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 11042–11057.
- Liu, J.; Jiang, R.; Zhu, D.; Zhao, J. Short-Term Subway Inbound Passenger Flow Prediction Based on AFC Data and PSO-LSTM Optimized Model. Urban Rail Transit 2022, 8, 56–66.
- Gao, Y.; Wang, R.; Zhou, E. Stock Prediction Based on Optimized LSTM and GRU Models. Sci. Program. 2021, 2021, 4055281.
- Shrestha, A.; Mahmood, A. Review of Deep Learning Algorithms and Architectures. IEEE Access 2019, 7, 53040–53065.
- Dong, S.; Wang, P.; Abbas, K. A Survey on Deep Learning and Its Applications. Comput. Sci. Rev. 2021, 40, 100379.
- Abd Elaziz, M.; Dahou, A.; Abualigah, L.; Yu, L.; Alshinwan, M.; Khasawneh, A.M.; Lu, S. Advanced Metaheuristic Optimization Techniques in Applications of Deep Neural Networks: A Review. Neural Comput. Appl. 2021, 33, 14079–14099.
- Akay, B.; Karaboga, D.; Akay, R. A Comprehensive Survey on Optimizing Deep Learning Models by Metaheuristics. Artif. Intell. Rev. 2022, 55, 829–894.
- Zhan, Z.-H.; Li, J.-Y.; Zhang, J. Evolutionary Deep Learning: A Survey. Neurocomputing 2022, 483, 42–58.
- Shickel, B.; Tighe, P.J.; Bihorac, A.; Rashidi, P. Deep EHR: A Survey of Recent Advances in Deep Learning Techniques for Electronic Health Record (EHR) Analysis. IEEE J. Biomed. Health Inform. 2017, 22, 1589–1604.
- Abram, K.J.; McCloskey, D. A Comprehensive Evaluation of Metabolomics Data Preprocessing Methods for Deep Learning. Metabolites 2022, 12, 202.
- Han, D.; Yang, X.; Li, G.; Wang, S.; Wang, Z.; Zhao, J. Highway Traffic Speed Prediction in Rainy Environment Based on APSO-GRU. J. Adv. Transp. 2021, 2021, 4060740.
- Tsokov, S.; Lazarova, M.; Aleksieva-Petrova, A. A Hybrid Spatiotemporal Deep Model Based on CNN and LSTM for Air Pollution Prediction. Sustainability 2022, 14, 5104.
- Jia, P.; Liu, H.; Wang, S.; Wang, P. Research on a Mine Gas Concentration Forecasting Model Based on a GRU Network. IEEE Access 2020, 8, 38023–38031.
- Shao, B.; Song, D.; Bian, G.; Zhao, Y. A Hybrid Approach by CEEMDAN-Improved PSO-LSTM Model for Network Traffic Prediction. Secur. Commun. Netw. 2022, 2022, 4975288.
- Yan, J.; Gao, Y.; Yu, Y.; Xu, H.; Xu, Z. A Prediction Model Based on Deep Belief Network and Least Squares SVR Applied to Cross-Section Water Quality. Water 2020, 12, 1929.
- Zhang, Y. Short-Term Power Load Forecasting Based on SAPSO-CNN-LSTM Model Considering Autocorrelated Errors. Math. Probl. Eng. 2022, 2022, 2871889.
- Abidi, M.H.; Alkhalefah, H.; Mohammed, M.K.; Umer, U.; Qudeiri, J.E.A. Optimal Scheduling of Flexible Manufacturing System Using Improved Lion-Based Hybrid Machine Learning Approach. IEEE Access 2020, 8, 96088–96114.
- Chen, X. Emotional Calculation Method of Rural Tourist Based on Improved SPCA-LSTM Algorithm. J. Sens. 2022, 2022, 3365498.
- Cheng, T.; Harrou, F.; Kadri, F.; Sun, Y.; Leiknes, T. Forecasting of Wastewater Treatment Plant Key Features Using Deep Learning-Based Models: A Case Study. IEEE Access 2020, 8, 184475–184485.
- Zhao, A.; Mi, L.; Xue, X.; Xi, J.; Jiao, Y. Heating Load Prediction of Residential District Using Hybrid Model Based on CNN. Energy Build. 2022, 266, 112122.
- Ghimire, S.; Bhandari, B.; Casillas-Pérez, D.; Deo, R.C.; Salcedo-Sanz, S. Hybrid Deep CNN-SVR Algorithm for Solar Radiation Prediction Problems in Queensland, Australia. Eng. Appl. Artif. Intell. 2022, 112, 104860.
- Liu, W.; Yu, H.; Yang, L.; Yin, Z.; Zhu, M.; Wen, X. Deep Learning-Based Predictive Framework for Groundwater Level Forecast in Arid Irrigated Areas. Water 2021, 13, 2558.
- Wang, F.; Ma, S.; Wang, H.; Li, Y.; Zhang, J. Prediction of NOx Emission for Coal-Fired Boilers Based on Deep Belief Network. Control Eng. Pract. 2018, 80, 26–35.
- Lian, P.; Liu, H.; Wang, X.; Guo, R. Soft Sensor Based on DBN-IPSO-SVR Approach for Rotor Thermal Deformation Prediction of Rotary Air-Preheater. Measurement 2020, 165, 108109.
- Song, J.; Xue, G.; Ma, Y.; Li, H.; Pan, Y.; Hao, Z. An Indoor Temperature Prediction Framework Based on Hierarchical Attention Gated Recurrent Unit Model for Energy Efficient Buildings. IEEE Access 2019, 7, 157268–157283.
- He, Q.-Q.; Wu, C.; Si, Y.-W. LSTM with Particle Swam Optimization for Sales Forecasting. Electron. Commer. Res. Appl. 2022, 51, 101118.
- Li, X.; Liu, B.; Qian, W.; Rao, G.; Chen, L.; Cui, J. Design of Soft-Sensing Model for Alumina Concentration Based on Improved Deep Belief Network. Processes 2022, 10, 2537.
- Jiang, H.; Hu, W.; Xiao, L.; Dong, Y. A Decomposition Ensemble Based Deep Learning Approach for Crude Oil Price Forecasting. Resour. Policy 2022, 78, 102855.
- Gao, B.; Huang, X.; Shi, J.; Tai, Y.; Zhang, J. Hourly Forecasting of Solar Irradiance Based on CEEMDAN and Multi-Strategy CNN-LSTM Neural Networks. Renew. Energy 2020, 162, 1665–1683.
- Yang, H.; Liu, S. Water Quality Prediction in Sea Cucumber Farming Based on a GRU Neural Network Optimized by an Improved Whale Optimization Algorithm. PeerJ Comput. Sci. 2022, 8, e1000.
- Zhang, Y.; Tang, J.; Cheng, Y.; Huang, L.; Guo, F.; Yin, X.; Li, N. Prediction of Landslide Displacement with Dynamic Features Using Intelligent Approaches. Int. J. Min. Sci. Technol. 2022, 32, 539–549.
- Hu, H.; Xia, X.; Luo, Y.; Zhang, C.; Nazir, M.S.; Peng, T. Development and Application of an Evolutionary Deep Learning Framework of LSTM Based on Improved Grasshopper Optimization Algorithm for Short-Term Load Forecasting. J. Build. Eng. 2022, 57, 104975.
- Wang, H.; Xiong, M.; Chen, H.; Liu, S. Multi-Step Ahead Wind Speed Prediction Based on a Two-Step Decomposition Technique and Prediction Model Parameter Optimization. Energy Rep. 2022, 8, 6086–6100.
- Duan, J.; Chang, M.; Chen, X.; Wang, W.; Zuo, H.; Bai, Y.; Chen, B. A Combined Short-Term Wind Speed Forecasting Model Based on CNN–RNN and Linear Regression Optimization Considering Error. Renew. Energy 2022, 200, 788–808.
- Liu, H.; Mi, X.; Li, Y.; Duan, Z.; Xu, Y. Smart Wind Speed Deep Learning Based Multi-Step Forecasting Model Using Singular Spectrum Analysis, Convolutional Gated Recurrent Unit Network and Support Vector Regression. Renew. Energy 2019, 143, 842–854.
- Shang, Z.; Wen, Q.; Chen, Y.; Zhou, B.; Xu, M. Wind Speed Forecasting Using Attention-Based Causal Convolutional Network and Wind Energy Conversion. Energies 2022, 15, 2881.
- Wang, S.; Qin, C.; Feng, Q.; Javadpour, F.; Rui, Z. A Framework for Predicting the Production Performance of Unconventional Resources Using Deep Learning. Appl. Energy 2021, 295, 117016.
- Tuerxun, W.; Xu, C.; Guo, H.; Guo, L.; Zeng, N.; Cheng, Z. An Ultra-short-term Wind Speed Prediction Model Using LSTM Based on Modified Tuna Swarm Optimization and Successive Variational Mode Decomposition. Energy Sci. Eng. 2022, 10, 3001–3022.
- Wang, Y.; Zhang, W.; Sun, J.; Wang, L.; Song, X.; Zhao, X. Survival Prediction Model for Patients with Esophageal Squamous Cell Carcinoma Based on the Parameter-Optimized Deep Belief Network Using the Improved Archimedes Optimization Algorithm. Comput. Math. Methods Med. 2022, 2022, 1924906.
- Mohammed, G.P.; Alasmari, N.; Alsolai, H.; Alotaibi, S.S.; Alotaibi, N.; Mohsen, H. Autonomous Short-Term Traffic Flow Prediction Using Pelican Optimization with Hybrid Deep Belief Network in Smart Cities. Appl. Sci. 2022, 12, 10828.
- Zhang, Y.; Gao, G. Optimization and Evaluation of an Intelligent Short-Term Blood Glucose Prediction Model Based on Noninvasive Monitoring and Deep Learning Techniques. J. Healthc. Eng. 2022, 2022, 8956850.
- Yang, C.-H.; Chen, B.-H.; Wu, C.-H.; Chen, K.-C.; Chuang, L.-Y. Deep Learning for Forecasting Electricity Demand in Taiwan. Mathematics 2022, 10, 2547.
- Gao, S.; Huang, Y.; Zhang, S.; Han, J.; Wang, G.; Zhang, M.; Lin, Q. Short-Term Runoff Prediction with GRU and LSTM Networks without Requiring Time Step Optimization during Sample Generation. J. Hydrol. 2020, 589, 125188.
- Li, J.; Zhang, Z.; Wang, X.; Yan, W. Intelligent Decision-Making Model in Preventive Maintenance of Asphalt Pavement Based on PSO-GRU Neural Network. Adv. Eng. Inform. 2022, 51, 101525.
- Meng, X.; Wang, R.; Zhang, X.; Wang, M.; Ma, H.; Wang, Z. Hybrid Neural Network Based on GRU with Uncertain Factors for Forecasting Ultra-Short-Term Wind Power. In Proceedings of the 2020 2nd International Conference on Industrial Artificial Intelligence (IAI), Shenyang, China, 23–25 October 2020; pp. 1–6.
- Saini, V.K.; Bhardwaj, B.; Gupta, V.; Kumar, R.; Mathur, A. Gated Recurrent Unit (GRU) Based Short Term Forecasting for Wind Energy Estimation. In Proceedings of the 2020 International Conference on Power, Energy, Control and Transmission Systems (ICPECTS), Chennai, India, 10–11 December 2020; pp. 1–6.
- Li, Q.; Chai, X.; Zhang, C.; Wang, X.; Ma, W. Prediction Model of Ischemic Stroke Recurrence Using PSO-LSTM in Mobile Medical Monitoring System. Comput. Intell. Neurosci. 2022, 2022, 8936103.
- Qiu, K.; Li, J.; Chen, D. Optimized Long Short-Term Memory (LSTM) Network for Performance Prediction in Unconventional Reservoirs. Energy Rep. 2022, 8, 15436–15445.
- Zhou, B.; Ma, X.; Luo, Y.; Yang, D. Wind Power Prediction Based on LSTM Networks and Nonparametric Kernel Density Estimation. IEEE Access 2019, 7, 165279–165292.
- Chen, F.; Gao, X.; Xia, X.; Xu, J. Using LSTM and PSO Techniques for Predicting Moisture Content of Poplar Fibers by Impulse-Cyclone Drying. PLoS ONE 2022, 17, e0266186.
- Aladag, C.H. Architecture Selection in Neural Networks by Statistical and Machine Learning. Orient. J. Comp. Sci. Technol. 2019, 12, 76–89.
- Sen, J.; Mehtab, S. Accurate Stock Price Forecasting Using Robust and Optimized Deep Learning Models. In Proceedings of the 2021 International Conference on Intelligent Technologies (CONIT), Hubli, India, 25–27 June 2021; pp. 1–9.
- Shi, X.; Huang, G.; Hao, X.; Yang, Y.; Li, Z. A Synchronous Prediction Model Based on Multi-Channel CNN with Moving Window for Coal and Electricity Consumption in Cement Calcination Process. Sensors 2021, 21, 4284.
- Itakura, K.; Saito, Y.; Suzuki, T.; Kondo, N.; Hosoi, F. Estimation of Citrus Maturity with Florescence Spectroscopy Using Deep Learning. Horticulturae 2018, 5, 2.
- Kulshrestha, A.; Krishnaswamy, V.; Sharma, M. Bayesian BILSTM Approach for Tourism Demand Forecasting. Ann. Tour. Res. 2020, 83, 102925.
- Di, Y.; Gao, M.; Feng, F.; Li, Q.; Zhang, H. A New Framework for Winter Wheat Yield Prediction Integrating Deep Learning and Bayesian Optimization. Agronomy 2022, 12, 3194.
- Yalçın, S.; Panchal, S.; Herdem, M.S. A CNN-ABC Model for Estimation and Optimization of Heat Generation Rate and Voltage Distributions of Lithium-Ion Batteries for Electric Vehicles. Int. J. Heat Mass Transf. 2022, 199, 123486.
- Rajamoorthy, R.; Saraswathi, H.V.; Devaraj, J.; Kasinathan, P.; Elavarasan, R.M.; Arunachalam, G.; Mostafa, T.M.; Mihet-Popa, L. A Hybrid Sailfish Whale Optimization and Deep Long Short-Term Memory (SWO-DLSTM) Model for Energy Efficient Autonomy in India by 2048. Sustainability 2022, 14, 1355.
- Ge, S.; Gao, W.; Cui, S.; Chen, X.; Wang, S. Safety Prediction of Shield Tunnel Construction Using Deep Belief Network and Whale Optimization Algorithm. Autom. Constr. 2022, 142, 104488.
- Sun, L.; Qin, H.; Przystupa, K.; Majka, M.; Kochan, O. Individualized Short-Term Electric Load Forecasting Using Data-Driven Meta-Heuristic Method Based on LSTM Network. Sensors 2022, 22, 7900.
- Hakim, W.L.; Rezaie, F.; Nur, A.S.; Panahi, M.; Khosravi, K.; Lee, C.-W.; Lee, S. Convolutional Neural Network (CNN) with Metaheuristic Optimization Algorithms for Landslide Susceptibility Mapping in Icheon, South Korea. J. Environ. Manag. 2022, 305, 114367.
- Pan, J.; Jing, B.; Jiao, X.; Wang, S. Analysis and Application of Grey Wolf Optimizer-Long Short-Term Memory. IEEE Access 2020, 8, 121460–121468.
- Mahdaddi, A.; Meshoul, S.; Belguidoum, M. EA-Based Hyperparameter Optimization of Hybrid Deep Learning Models for Effective Drug-Target Interactions Prediction. Expert Syst. Appl. 2021, 185, 115525.
- Wang, Y.; Wei, S.; Yang, W.; Chai, Y.; Li, P. Construction of Offline Predictive Controller for Wind Farm Based on CNN–GRNN. Control Eng. Pract. 2022, 127, 105290.
- Zhang, Z.; Wang, S.; Wang, P.; Jiang, P.; Zhou, H. Research on Fault Early Warning of Wind Turbine Based on IPSO-DBN. Energies 2022, 15, 9072.
- Hu, Y.; Wei, R.; Yang, Y.; Li, X.; Huang, Z.; Liu, Y.; He, C.; Lu, H. Performance Degradation Prediction Using LSTM with Optimized Parameters. Sensors 2022, 22, 2407.
- Gao, J.; Wang, X.; Yang, W. SPSO-DBN Based Compensation Algorithm for Lackness of Electric Energy Metering in Micro-Grid. Alex. Eng. J. 2022, 61, 4585–4594.
- Gao, S.; Xu, L.; Zhang, Y.; Pei, Z. Rolling Bearing Fault Diagnosis Based on SSA Optimized Self-Adaptive DBN. ISA Trans. 2022, 128, 485–502.
- Wang, X.; Yan, C.; Liu, W.; Liu, X. Research on Carbon Emissions Prediction Model of Thermal Power Plant Based on SSA-LSTM Algorithm with Boiler Feed Water Influencing Factors. Sustainability 2022, 14, 15988.
- Shao, B.; Song, D.; Bian, G.; Zhao, Y. Wind Speed Forecast Based on the LSTM Neural Network Optimized by the Firework Algorithm. Adv. Mater. Sci. Eng. 2021, 2021, 4874757.
- Eid, M.M.; El-Kenawy, E.-S.M.; Khodadadi, N.; Mirjalili, S.; Khodadadi, E.; Abotaleb, M.; Alharbi, A.H.; Abdelhamid, A.A.; Ibrahim, A.; Amer, G.M.; et al. Meta-Heuristic Optimization of LSTM-Based Deep Network for Boosting the Prediction of Monkeypox Cases. Mathematics 2022, 10, 3845.
- Yin, X.; Liu, Q.; Huang, X.; Pan, Y. Real-Time Prediction of Rockburst Intensity Using an Integrated CNN-Adam-BO Algorithm Based on Microseismic Data and Its Engineering Application. Tunn. Undergr. Space Technol. 2021, 117, 104133.
- Yu, Z.; Sun, Y.; Zhang, J.; Zhang, Y.; Liu, Z. Gated Recurrent Unit Neural Network (GRU) Based on Quantile Regression (QR) Predicts Reservoir Parameters through Well Logging Data. Front. Earth Sci. 2023, 11, 1087385.
- Wei, D.; Wang, J.; Niu, X.; Li, Z. Wind Speed Forecasting System Based on Gated Recurrent Units and Convolutional Spiking Neural Networks. Appl. Energy 2021, 292, 116842.
- Wang, J.; Li, Q.; Zhang, H.; Wang, Y. A Deep-Learning Wind Speed Interval Forecasting Architecture Based on Modified Scaling Approach with Feature Ranking and Two-Output Gated Recurrent Unit. Expert Syst. Appl. 2023, 211, 118419.
- Li, Y.-Q.; Zhao, H.-W.; Yue, Z.-X.; Li, Y.-W.; Zhang, Y.; Zhao, D.-C. Real-Time Intelligent Prediction Method of Cable’s Fundamental Frequency for Intelligent Maintenance of Cable-Stayed Bridges. Sustainability 2023, 15, 4086.
- Xu, Y.; Zhang, J.; Long, Z.; Tang, H.; Zhang, X. Hourly Urban Water Demand Forecasting Using the Continuous Deep Belief Echo State Network. Water 2019, 11, 351.
- Uzair, M.; Jamil, N. Effects of Hidden Layers on the Efficiency of Neural Networks. In Proceedings of the 2020 IEEE 23rd International Multitopic Conference (INMIC), Bahawalpur, Pakistan, 5–7 November 2020; pp. 1–6.
- Li, W.; Wu, H.; Zhu, N.; Jiang, Y.; Tan, J.; Guo, Y. Prediction of Dissolved Oxygen in a Fishery Pond Based on Gated Recurrent Unit (GRU). Inf. Process. Agric. 2021, 8, 185–193.
- Hu, Z. Prediction Model of Rotor Yarn Quality Based on CNN-LSTM. J. Sens. 2022, 2022, 3955047.
- Li, H.; Zhao, Z.; Du, X. Research and Application of Deformation Prediction Model for Deep Foundation Pit Based on LSTM. Wirel. Commun. Mob. Comput. 2022, 2022, 9407999.
- Chen, J.; Zhang, H.; Zhang, W.; Du, X.; Zhang, Y.; Li, S. Correlated Regression Feature Learning for Automated Right Ventricle Segmentation. IEEE J. Transl. Eng. Health Med. 2018, 6, 1800610.
- Aslam, S.; Ayub, N.; Farooq, U.; Alvi, M.J.; Albogamy, F.R.; Rukh, G.; Haider, S.I.; Azar, A.T.; Bukhsh, R. Towards Electric Price and Load Forecasting Using CNN-Based Ensembler in Smart Grid. Sustainability 2021, 13, 12653.
- Zou, M.; Zhu, S.; Gu, J.; Korunovic, L.M.; Djokic, S.Z. Heating and Lighting Load Disaggregation Using Frequency Components and Convolutional Bidirectional Long Short-Term Memory Method. Energies 2021, 14, 4831.
- Miao, S.; Wang, Z.J.; Liao, R. A CNN Regression Approach for Real-Time 2D/3D Registration. IEEE Trans. Med. Imaging 2016, 35, 1352–1363.
- Zhao, Y.; Hu, H.; Song, C.; Wang, Z. Predicting Compressive Strength of Manufactured-Sand Concrete Using Conventional and Metaheuristic-Tuned Artificial Neural Network. Measurement 2022, 194, 110993.
- Rather, A.M. LSTM-Based Deep Learning Model for Stock Prediction and Predictive Optimization Model. EURO J. Decis. Process. 2021, 9, 100001.
- Pattana-Anake, V.; Joseph, F.J.J. Hyper Parameter Optimization of Stack LSTM Based Regression for PM 2.5 Data in Bangkok. In Proceedings of the 2022 7th International Conference on Business and Industrial Research (ICBIR), Bangkok, Thailand, 19–20 May 2022; pp. 13–17.
- Bian, J.; Wang, L.; Scherer, R.; Wozniak, M.; Zhang, P.; Wei, W. Abnormal Detection of Electricity Consumption of User Based on Particle Swarm Optimization and Long Short Term Memory With the Attention Mechanism. IEEE Access 2021, 9, 47252–47265.
- Yan, J.; Chen, X.; Yu, Y.; Zhang, X. Application of a Parallel Particle Swarm Optimization-Long Short Term Memory Model to Improve Water Quality Data. Water 2019, 11, 1317.
- Kaselimi, M.; Doulamis, N.; Doulamis, A.; Voulodimos, A.; Protopapadakis, E. Bayesian-Optimized Bidirectional LSTM Regression Model for Non-Intrusive Load Monitoring. In Proceedings of the ICASSP 2019—2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, 12–17 May 2019; pp. 2747–2751.
- Wang, Y.; Feng, B.; Hua, Q.-S.; Sun, L. Short-Term Solar Power Forecasting: A Combined Long Short-Term Memory and Gaussian Process Regression Method. Sustainability 2021, 13, 3665.
- Islam, M.S.; Hossain, E. Foreign Exchange Currency Rate Prediction Using a GRU-LSTM Hybrid Network. Soft Comput. Lett. 2021, 3, 100009.
- Violos, J.; Tsanakas, S.; Theodoropoulos, T.; Leivadeas, A.; Tserpes, K.; Varvarigou, T. Hypertuning GRU Neural Networks for Edge Resource Usage Prediction. In Proceedings of the 2021 IEEE Symposium on Computers and Communications (ISCC), Athens, Greece, 5–8 September 2021; pp. 1–8.
- Wang, J.; Cao, J.; Yuan, S.; Cheng, M. Short-Term Forecasting of Natural Gas Prices by Using a Novel Hybrid Method Based on a Combination of the CEEMDAN-SE-and the PSO-ALS-Optimized GRU Network. Energy 2021, 233, 121082.
- Cao, M.; Zhang, T.; Liu, Y.; Wang, Y.; Shi, Z. A Bayesian Optimization Hyperband-Optimized Incremental Deep Belief Network for Online Battery Behaviour Modelling for a Satellite Simulator. J. Energy Storage 2023, 58, 106348.
- Wang, S.; Lin, X.; Qi, X.; Li, H.; Yang, J. Landslide Susceptibility Analysis Based on a PSO-DBN Prediction Model in an Earthquake-Stricken Area. Front. Environ. Sci. 2022, 10, 912523.
- Cao, M.; Zhang, T.; Wang, J.; Liu, Y. A Deep Belief Network Approach to Remaining Capacity Estimation for Lithium-Ion Batteries Based on Charging Process Features. J. Energy Storage 2022, 48, 103825.
- Choi, R.Y.; Coyner, A.S.; Kalpathy-Cramer, J.; Chiang, M.F.; Campbell, J.P. Introduction to Machine Learning, Neural Networks, and Deep Learning. Transl. Vis. Sci. Technol. 2020, 9, 14.
- Zhao, X.; Liu, D.; Yan, X. Diameter Prediction of Silicon Ingots in the Czochralski Process Based on a Hybrid Deep Learning Model. Crystals 2022, 13, 36.
- Smith, L.N. Cyclical Learning Rates for Training Neural Networks. In Proceedings of the 2017 IEEE Winter Conference on Applications of Computer Vision (WACV), Santa Rosa, CA, USA, 24–31 March 2017; pp. 464–472.
- Lee, S.; Kim, J.; Kang, H.; Kang, D.-Y.; Park, J. Genetic Algorithm Based Deep Learning Neural Network Structure and Hyperparameter Optimization. Appl. Sci. 2021, 11, 744.
- Shah, B.; Bhavsar, H. Time Complexity in Deep Learning Models. Procedia Comput. Sci. 2022, 215, 202–210.
Literature | Platforms | Sections of Optimization |
---|---|---|
This study | 1. TensorFlow with Keras 2. PyTorch 3. Caffe 4. Theano with Keras 5. Matlab | 1. Data preprocessing 2. Architectures 3. Optimizers 4. Hyperparameters |
[10] | TensorFlow | Hyperparameters |
[5,13] | Unavailable | 1. Data preprocessing 2. Architectures 3. Hyperparameters |
[11,12] | Unavailable | 1. Architectures 2. Hyperparameters |
Sections | Parameters | Data Types |
---|---|---|
Architectures | Number of layers | Integer |
Architectures | Number of nodes | Integer |
Architectures | Number of kernel sizes | Integer |
Architectures | Number of pooling | Integer |
Optimizers | Optimizers | Categories (Adadelta, Adagrad, Adam, Adamx, Ftrl, Nadam, RMSprop, SGD, SparseAdam, Adamax, ASGD, LBFGS, NAdam, RAdam, Rprop, Nesterov) |
Hyperparameters | Learning rates | Real |
Hyperparameters | Number of epochs | Integer |
Hyperparameters | Batch sizes | Integer |
Hyperparameters | Iterations | Integer |
Hyperparameters | Dropout rates | Real |
Data preprocessing | Tasks | Categories (missing values, dimensionality reduction, removing outliers, feature selection, decomposition, normalization) |
Tasks of Data Preprocessing | Methods |
---|---|
Missing values | Supplementing average value [16,17], linear interpolation [17,18], KNN [19,20] |
Dimensionality reduction | PCA [8], pooling layer [21], t-SNE [22], SPCA [23] |
Removing outliers | Pauta criterion [18], EWMA [24] |
Feature selection | PSO [1], LASSO [8,25], ASO [26], GA [27], MI [28], GRA [29], PCC [30], CCA [31] |
Decomposition | EMD [32], EEMD [33], CEEMDAN [19,27,34,35,36,37,38], ICEEMDAN [39], SSA [40,41], VMD [42], SVMD [43] |
Normalization | [6,17,20,27,30,31,34,42,44,45,46,47,48,49,50,51,52,53,54,55] |
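
A few of the preprocessing tasks listed in the table above can be sketched in code. The following is a minimal illustration using only the Python standard library, not the procedure of any cited study. It shows mean filling and linear interpolation for missing values, a 3-sigma filter standing in for the Pauta criterion, and min-max scaling as one common form of normalization. All function names are hypothetical, and min-max scaling is an assumption, since the table does not state which normalization each study applied.

```python
from statistics import mean, stdev


def fill_missing_with_mean(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    avg = mean(observed)
    return [avg if v is None else v for v in values]


def fill_missing_linear(values):
    """Fill interior None gaps by linear interpolation between observed neighbours.

    Leading and trailing None entries are left unchanged.
    """
    out = list(values)
    known = [i for i, v in enumerate(out) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (out[right] - out[left]) / (right - left)
        for i in range(left + 1, right):
            out[i] = out[left] + step * (i - left)
    return out


def pauta_filter(values, k=3.0):
    """Keep points within k sample standard deviations of the mean (3-sigma rule)."""
    mu, sigma = mean(values), stdev(values)
    return [v for v in values if abs(v - mu) <= k * sigma]


def min_max_normalize(values):
    """Rescale values linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series carries no scale information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


if __name__ == "__main__":
    series = [10.0] * 19 + [100.0]          # one obvious outlier
    cleaned = pauta_filter(series)
    print(len(series), len(cleaned))
    print(fill_missing_linear([1.0, None, None, 4.0]))
    print(min_max_normalize([2.0, 4.0, 6.0]))
```

Note that the 3-sigma rule needs a reasonably long series: with very few samples, a single extreme value inflates the standard deviation enough that it can never exceed the threshold.
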
Models | CNN | LSTM/GRU/DBN | ||||
---|---|---|---|---|---|---|
Parameters | Number of Layers, Kernel Sizes, and Pooling | Number of Kernel Sizes | Number of Layers and Nodes | Number of Nodes | Number of Layers | |
Methods | ||||||
Grid search | [57,58] | [34] | [30] | |||
Bayesian | [59] | [2,42,53,60,61] | ||||
GA | [17] | |||||
IBFO | [46] | |||||
ABC | [62] | |||||
SOA | [33,43] | |||||
IWOA | [35] | [38,63,64] | ||||
IGOA | [37] | |||||
SCA | [65] | |||||
GWO | [66] | [67] | ||||
DEA | [68] | |||||
PSO/IPSO SPSO/SAPSO | [69] | [21] | [55] | [7,19,31,49,52,70,71] | [72] | |
tSBO | [25] | |||||
SSA | [73] | [74] | ||||
FWA | [75] | |||||
BER | [76] | |||||
Trial-and-error | [77] | [1,18,48,78,79,80] | [27,81,82] |
DNNR Models | Adam (Adamax) | RMSprop | Unavailable |
---|---|---|---|
CNN | [17,26,34,40,41,57,58,62,77,85,87,88,89] | [25,39,59,66,69,90,91] | |
LSTM | [2,7,19,31,52,53,54,60,61,63,68,71,75,81,86,92,93,94] | [8,23,37,38,43,55,65,67,74,76,95,96,97] | |
GRU | [18,24,30,33,46,49,50,51,78,79,80] | [48,84] | [36,98,99,100] |
DBN | [6,20,101] | [1,28,32,42,44,45,64,70,72,73,82,102,103,104,105] |
Optimizers | Deep Learning Platforms | ||||
---|---|---|---|---|---|
Caffe | Matlab | PyTorch | TensorFlow with Keras | Theano with Keras | |
Adadelta | X | X | X | X | |
Adagrad | X | X | X | X | |
Adam | X | X | X | X | X |
Adamx | X | X | X | ||
Ftrl | X | X | |||
Nadam | X | X | |||
RMSprop | X | X | X | X | X |
SGD | X | X | X | X | X |
SparseAdam | X | ||||
Adamax | X | ||||
ASGD | X | ||||
LBFGS | X | ||||
NAdam | X | ||||
RAdam | X | ||||
Rprop | X | ||||
Nesterov | X |
Hyperparameters | Learning Rates | Number of Epochs | Batch Sizes | Iterations | Dropout Rates | |
---|---|---|---|---|---|---|
Methods | ||||||
Metaheuristics | PSO/IPSO/SPSO/SAPSO/PPSO | [7,21,49,70,72,94,102] | [7,49,102] | [7,44,49,52,55,71,102] | [82,102] | |
Hyper-Opt | [26] | [26] | ||||
SCA | [65] | |||||
IGOA | [37] | |||||
SOA/MTSO | [43] | [33] | [33] | [43] | ||
GWO | [67] | [66] | ||||
BER | [76] | |||||
WOA/IWOA | [35,63,64] | [63] | [35] | [35,38,64] | [63] | |
FWA | [75] | [75] | [75] | |||
IBFO | [46] | [46] | ||||
ABC | [62] | |||||
GA | [17] | |||||
ACO | [6] | [6] | [6] | |||
SSA | [74] | [73,74] | ||||
Others | Grid search | [30] | [30] | [30] | [34] | |
Bayesian | [42,59,89,96,101] | [2,43,53,61,101] | [29,53,61,101] | [42] | [61,77] | |
Unavailable | [1,8,23,27,28,84,85,92] | [1,8,28,78,84,85] | [1,23,28,79] | [1,18,27] | [42] |
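
Of the tuning methods in the table above, grid search is the simplest to illustrate: every combination of candidate values is evaluated and the best one is kept. The sketch below is a minimal version under stated assumptions. The function `toy_validation_loss` is a hypothetical stand-in for training a network and returning its validation error, and the candidate values are arbitrary examples rather than settings taken from the cited studies.

```python
from itertools import product


def grid_search(objective, grid):
    """Evaluate every combination in `grid` and return (best_config, best_score).

    `grid` maps each hyperparameter name to a list of candidate values;
    `objective` takes a config dict and returns a score to minimize.
    """
    names = list(grid)
    best_config, best_score = None, float("inf")
    for combo in product(*(grid[name] for name in names)):
        config = dict(zip(names, combo))
        score = objective(config)
        if score < best_score:
            best_config, best_score = config, score
    return best_config, best_score


def toy_validation_loss(config):
    """Hypothetical objective; a real one would train and validate a model."""
    return (
        abs(config["learning_rate"] - 0.01)
        + abs(config["batch_size"] - 32) / 1000
        + abs(config["dropout_rate"] - 0.2)
    )


if __name__ == "__main__":
    grid = {
        "learning_rate": [0.1, 0.01, 0.001],   # real-valued
        "batch_size": [16, 32, 64],            # integer
        "dropout_rate": [0.0, 0.2, 0.5],       # real-valued
    }
    best, score = grid_search(toy_validation_loss, grid)
    print(best, score)
```

The number of evaluations is the product of the list lengths (27 here), which grows quickly as more hyperparameters or candidate values are added.
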
Papers | Architectures | Optimizers | Hyperparameters | Data Preprocessing |
---|---|---|---|---|
[33] | X | X | X | X |
[17,30,34,35,37,38,42,43,46,53] | X | X | X | |
[2,59,61,62,63,64,65,66,67] | X | X | ||
[48,51] | X | X | ||
[6,21,26,29,44,49,52,55] | X | X | ||
[57,58,60,68] | X | |||
[84,85,86] | X | |||
[7,70,71,72,73,74,75,76,77,82,89,94,96,101,102] | X | |||
[1,8,16,18,19,20,22,23,24,25,27,28,31,32,36,39,40,41,45,47,50,54] | X |
Sections of Optimization | Types of Variables |
---|---|
Architectures | Integers |
Optimizers | Categories |
Hyperparameters | Integers, real numbers |
Data preprocessing | Categories |
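
Because the four sections mix integer, real, and categorical variables, a search method that works on real-valued vectors needs some way to map a candidate vector onto a valid configuration. One common approach, shown here only as an illustrative sketch and not as the encoding used by the studies above, is to keep every coordinate in [0, 1] and decode it by rounding (integers), log-scaling (learning rate), or indexing into a list (categories). The function `decode`, the ranges, and the three-optimizer list are all assumptions chosen for the example.

```python
# Hypothetical category list; a subset of the optimizers named in the tables.
OPTIMIZERS = ["Adam", "RMSprop", "SGD"]


def _scale(u, low, high):
    """Map u in [0, 1] linearly onto [low, high]."""
    return low + u * (high - low)


def decode(position):
    """Decode a vector in [0, 1]^4 into a mixed-type configuration.

    position[0] -> number of layers (integer, 1..5)
    position[1] -> optimizer (category)
    position[2] -> learning rate (real, 1e-4..1e-1 on a log scale)
    position[3] -> batch size (integer, 16..128)
    """
    n_layers = int(round(_scale(position[0], 1, 5)))
    # Clamp the index so that position[1] == 1.0 still selects the last category.
    opt_index = min(int(position[1] * len(OPTIMIZERS)), len(OPTIMIZERS) - 1)
    learning_rate = 10 ** _scale(position[2], -4, -1)
    batch_size = int(round(_scale(position[3], 16, 128)))
    return {
        "n_layers": n_layers,
        "optimizer": OPTIMIZERS[opt_index],
        "learning_rate": learning_rate,
        "batch_size": batch_size,
    }


if __name__ == "__main__":
    print(decode([0.0, 0.0, 0.0, 0.0]))
    print(decode([1.0, 1.0, 1.0, 1.0]))
```

Decoding on the fly keeps the search itself unchanged: the optimizer only ever sees real vectors, and the type constraints are enforced at evaluation time.
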
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Chen, C.-H.; Lai, J.-P.; Chang, Y.-M.; Lai, C.-J.; Pai, P.-F. A Study of Optimization in Deep Neural Networks for Regression. Electronics 2023, 12, 3071. https://doi.org/10.3390/electronics12143071