Batch Gradient Learning Algorithm with Smoothing Regularization for Feedforward Neural Networks
Abstract
1. Introduction
2. Network Structure and Learning Algorithm Methodology
2.1. Network Structure
2.2. Modified Error Function with Smoothing Regularization (BGS)
3. Materials and Methods
If Propositions 1–3 hold, the weight sequence $\{\mathbf{W}^m\}$, with $\mathbf{W} = (\mathbf{u}, \mathbf{V})$, generated by the learning algorithm satisfies the following conclusions:
- I. Monotonicity: $E(\mathbf{W}^{m+1}) \le E(\mathbf{W}^{m})$ for $m = 0, 1, 2, \ldots$;
- II. There exists $E^{*} \ge 0$ such that $\lim_{m \to \infty} E(\mathbf{W}^{m}) = E^{*}$;
- III. Weak convergence for the output weights: $\lim_{m \to \infty} \lVert E_{\mathbf{u}}(\mathbf{W}^{m}) \rVert = 0$;
- IV. Weak convergence for the hidden-layer weights: $\lim_{m \to \infty} \lVert E_{\mathbf{V}}(\mathbf{W}^{m}) \rVert = 0$.
Further, if Proposition 4 is also valid, we have the following strong convergence:
- V. There exists a point $\mathbf{W}^{*}$ such that $\lim_{m \to \infty} \mathbf{W}^{m} = \mathbf{W}^{*}$.
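For orientation, the regularized error function (Equation (10)) and the batch update rule (Equation (14)) referenced below have the following generic form; the concrete smoothing function and network map are those defined in Section 2, so this display is a sketch rather than a verbatim reproduction:

$$
E(\mathbf{W}) \;=\; \frac{1}{2} \sum_{j=1}^{J} \Bigl( y^{j} - g\bigl(\mathbf{u} \cdot G(\mathbf{V}\mathbf{x}^{j})\bigr) \Bigr)^{2} \;+\; \lambda \sum_{i} f(w_{i}),
\qquad
\mathbf{W}^{m+1} \;=\; \mathbf{W}^{m} - \eta\, E_{\mathbf{W}}(\mathbf{W}^{m}),
$$

where $g$ and $G$ denote the output- and hidden-layer activations, $\eta > 0$ is the learning rate, $\lambda > 0$ is the regularization parameter, and $f$ is a smooth surrogate for the sparsity-inducing penalty, so that $E$ is differentiable everywhere and the gradients in Equation (15) are well defined.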
| Algorithm 1 The learning algorithm | |
|---|---|
| Input | Input the dimension of the input, the number of hidden nodes, the maximum iteration number, the learning rate, the regularization parameter, and the training sample set. |
| Initialization | Randomly initialize the initial weight vectors $\mathbf{u}^{0}$ and $\mathbf{V}^{0}$. |
| Training | For $m = 0, 1, \ldots$ up to the maximum iteration number: compute the error function by Equation (10); compute the gradients by Equation (15); update the weights $\mathbf{u}^{m}$ and $\mathbf{V}^{m}$ using Equation (14). |
| Output | Output the final weight vectors $\mathbf{u}$ and $\mathbf{V}$. |
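As a concrete illustration of Algorithm 1, the following is a minimal NumPy sketch of the batch gradient loop, assuming a single hidden layer with tanh activations and the common smooth surrogate $f(w) = 1 - \exp(-w^{2}/(2\sigma^{2}))$ in place of the paper's exact smoothing function; the names (`train_bgs`, `sigma`) are illustrative, not taken from the paper:

```python
import numpy as np

def train_bgs(X, y, n_hidden=6, eta=0.009, lam=3e-4, max_iter=2000,
              sigma=0.1, seed=0):
    """Batch gradient descent with a smoothed sparsity penalty (sketch).

    X: (N, p) input samples; y: (N,) targets in [-1, 1].
    """
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    V = rng.uniform(-0.5, 0.5, size=(n_hidden, p))  # input-to-hidden weights
    u = rng.uniform(-0.5, 0.5, size=n_hidden)       # hidden-to-output weights

    def pen_grad(w):
        # Gradient of the smooth penalty f(w) = 1 - exp(-w^2 / (2 sigma^2)),
        # a differentiable surrogate for the sparsity-inducing penalty.
        return (w / sigma**2) * np.exp(-w**2 / (2.0 * sigma**2))

    for _ in range(max_iter):
        H = np.tanh(X @ V.T)          # hidden activations, shape (N, n_hidden)
        out = np.tanh(H @ u)          # network outputs, shape (N,)
        err = out - y
        d_out = err * (1.0 - out**2)                  # output-layer delta
        grad_u = H.T @ d_out + lam * pen_grad(u)      # batch gradient w.r.t. u
        d_hid = np.outer(d_out, u) * (1.0 - H**2)     # hidden-layer deltas
        grad_V = d_hid.T @ X + lam * pen_grad(V)      # batch gradient w.r.t. V
        u -= eta * grad_u
        V -= eta * grad_V
    return u, V
```

Because the penalty is computed over the whole batch once per iteration, the update matches the batch (rather than online) gradient scheme analyzed in the convergence theorem.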
4. Experimental Results
4.1. N-Dimensional Parity Problems
4.2. Function Approximation Problem
5. Discussion
6. Conclusions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A
| Problems | Network Structure | Initial Weight Range | Max Iterations | Learning Rate | Regularization Parameter |
|---|---|---|---|---|---|
| 3-bit parity | 3-6-1 | [−0.5, 0.5] | 2000 | 0.009 | 0.0003 |
| 6-bit parity | 6-20-1 | [−0.5, 0.5] | 3000 | 0.006 | 0.003 |
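To reproduce the parity setup in the table above, the training set for the n-bit parity problem can be generated exhaustively. The snippet below builds the 3-bit case with a ±1 encoding (an assumption; the paper's exact encoding is not shown in this excerpt) and trains the `train_bgs` sketch given earlier:

```python
import itertools
import numpy as np

def parity_dataset(n_bits):
    """All 2^n patterns of an n-bit parity problem, with inputs encoded
    as -1/+1 and target +1 when the number of +1 bits is odd."""
    X = np.array(list(itertools.product([-1.0, 1.0], repeat=n_bits)))
    ones = (X > 0).sum(axis=1)
    y = np.where(ones % 2 == 1, 1.0, -1.0)
    return X, y

# 3-bit parity with the settings from the table above
# (3-6-1 network, learning rate 0.009, regularization 0.0003, 2000 iterations).
X, y = parity_dataset(3)
u, V = train_bgs(X, y, n_hidden=6, eta=0.009, lam=0.0003, max_iter=2000)
pred = np.tanh(np.tanh(X @ V.T) @ u)
print(0.5 * np.mean((pred - y) ** 2))  # half mean squared error over the batch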
| Problems | Learning Algorithms | Average Error | Norm of Gradient | Time (s) |
|---|---|---|---|---|
| 3-bit parity | BG | 3.7979 × 10⁻⁷ | 0.0422 | 1.156248 |
| | BG | 5.4060 × 10⁻⁷ | 7.1536 × 10⁻⁴ | 1.216248 |
| | BG | 9.7820 × 10⁻⁷ | 8.7826 × 10⁻⁴ | 1.164721 |
| | BGS | 1.7951 × 10⁻⁸ | 0.0011 | 1.155829 |
| | BGS | 7.6653 × 10⁻⁹ | 7.9579 × 10⁻⁵ | 1.135742 |
| 6-bit parity | BG | 8.1281 × 10⁻⁵ | 1.1669 | 52.225856 |
| | BG | 3.8917 × 10⁻⁵ | 0.0316 | 52.359129 |
| | BG | 4.1744 × 10⁻⁵ | 0.0167 | 52.196552 |
| | BGS | 4.8349 × 10⁻⁵ | 0.0088 | 52.210994 |
| | BGS | 4.1656 × 10⁻⁶ | 0.0015 | 52.106554 |
| Learning Algorithms | Average Error | Norm of Gradient | Time (s) |
|---|---|---|---|
| BG | 0.0388 | 0.3533 | 4.415500 |
| BG | 0.0389 | 0.3050 | 4.368372 |
| BG | 0.0390 | 0.3087 | 4.368503 |
| BGS | 0.0386 | 0.2999 | 4.349813 |
| BGS | 0.0379 | 0.2919 | 4.320198 |
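A function approximation run follows the same pattern. The snippet below fits a one-dimensional target; the choice of sin(x) and the hyperparameters here are illustrative placeholders, since the paper's exact target function and settings are not reproduced in this excerpt:

```python
import numpy as np

# Illustrative function approximation run with the train_bgs sketch;
# the target sin(x) and hyperparameters are placeholders, not the
# paper's exact experiment.
X = np.linspace(-np.pi, np.pi, 200).reshape(-1, 1)
y = np.sin(X).ravel()

u, V = train_bgs(X, y, n_hidden=10, eta=0.01, lam=1e-4, max_iter=3000)
pred = np.tanh(np.tanh(X @ V.T) @ u)
print("half MSE:", 0.5 * np.mean((pred - y) ** 2))
```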