Performance Comparison of CNN Models Using Gradient Flow Analysis
Abstract
1. Introduction
2. Materials and Research Method
- Research question 1: How can the gradient flow of CNN models built from repeated bottleneck blocks be represented mathematically in terms of a single bottleneck layer? (One standard form of such a representation is sketched after this list.)
- Research question 2: Do the analytical results for research question 1 agree with the experimentally measured error-rate performance of the various CNN models?
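For reference, the usual starting point for such an analysis is the identity-mapping argument of He et al.; the following is a general sketch of that standard identity, not necessarily the exact formulation derived in this paper. For a stack of blocks with identity skip connections, $x_{l+1} = x_l + F(x_l, W_l)$, unrolling from layer $l$ to any deeper layer $L$ gives:

```latex
x_L = x_l + \sum_{i=l}^{L-1} F(x_i, W_i)
\quad\Longrightarrow\quad
\frac{\partial \mathcal{L}}{\partial x_l}
  = \frac{\partial \mathcal{L}}{\partial x_L}
    \left( 1 + \frac{\partial}{\partial x_l} \sum_{i=l}^{L-1} F(x_i, W_i) \right)
```

The additive term 1 means that part of the loss gradient reaches every block unattenuated, no matter how deep the network is.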
3. Research Results
4. Discussion
5. Conclusions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A
Baseline Model | Architecture | Number of Param.
---|---|---
VGG-19 | conv3-64 ×2, max pool; conv3-128 ×2, max pool; conv3-256 ×4, max pool; conv3-512 ×4, max pool; conv3-512 ×4, max pool; FC-4096 ×2; FC-10 (in case of CIFAR-10); softmax | 38.9 M
ResNet-50 | 7×7 conv (stride = 2); max pool (stride = 2); bottleneck block (1) ×3; bottleneck block (2) ×4; bottleneck block (3) ×6; bottleneck block (4) ×3; average pool; FC-10 (in case of CIFAR-10); softmax | 23.6 M
SE-ResNet-50 | 7×7 conv (stride = 2); max pool (stride = 2); SE bottleneck block (1) ×3; SE bottleneck block (2) ×4; SE bottleneck block (3) ×6; SE bottleneck block (4) ×3; average pool; FC-10 (in case of CIFAR-10); softmax | 21.4 M
DenseNet-121 | 7×7 conv (stride = 2); max pool (stride = 2); dense block (1) (1×1 conv + dropout, 3×3 conv + dropout) ×6; transition layer (1) (1×1 conv, 2×2 average pool); dense block (2) (1×1 conv + dropout, 3×3 conv + dropout) ×12; transition layer (2) (1×1 conv, 2×2 average pool); dense block (3) (1×1 conv + dropout, 3×3 conv + dropout) ×24; transition layer (3) (1×1 conv, 2×2 average pool); dense block (4) (1×1 conv + dropout, 3×3 conv + dropout) ×16; average pool; FC-10 (in case of CIFAR-10); softmax | 7 M
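To make the "bottleneck block" rows in the table above concrete, here is a minimal PyTorch-style sketch of a ResNet-50-style bottleneck block. This is an illustrative sketch, not the paper's code: the class name, argument names, and default widths are assumptions, following the standard 1×1 → 3×3 → 1×1 design with 4× channel expansion.

```python
import torch.nn as nn

class Bottleneck(nn.Module):
    """ResNet-style bottleneck: 1x1 reduce -> 3x3 -> 1x1 expand, plus a skip path."""

    def __init__(self, in_ch, mid_ch, stride=1):
        super().__init__()
        out_ch = 4 * mid_ch  # standard 4x expansion after the 3x3 conv
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, 1, bias=False),
            nn.BatchNorm2d(mid_ch), nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, mid_ch, 3, stride=stride, padding=1, bias=False),
            nn.BatchNorm2d(mid_ch), nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
        )
        # Identity skip when shapes match; 1x1 projection otherwise.
        if stride == 1 and in_ch == out_ch:
            self.skip = nn.Identity()
        else:
            self.skip = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
                nn.BatchNorm2d(out_ch),
            )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        # The skip term is what contributes the "+1" in the gradient-flow identity.
        return self.relu(self.body(x) + self.skip(x))
```

For example, `Bottleneck(64, 64)` maps a 64-channel input to 256 channels, as in the first bottleneck stage of ResNet-50.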
Top-1 error rates for 18- and 34-layer plain networks versus their residual counterparts (values as reported by He et al. for the ImageNet validation set):

Number of Weighted Layers | Plain Networks | ResNets
---|---|---
18 layers | 27.94% | 27.88%
34 layers | 28.54% | 25.03%
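The degradation visible in the plain-network column has a standard gradient-flow explanation (again a general sketch, not a claim about this paper's exact derivation). Without skip connections, $x_{l+1} = f(x_l, W_l)$, so the gradient reaching layer $l$ is a product of Jacobians:

```latex
\frac{\partial \mathcal{L}}{\partial x_l}
  = \frac{\partial \mathcal{L}}{\partial x_L}
    \prod_{i=l}^{L-1} \frac{\partial f(x_i, W_i)}{\partial x_i}
```

Such a product can shrink or grow exponentially with depth, which is consistent with the 34-layer plain network performing worse than the 18-layer one even as its residual counterpart improves.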
Single-crop error rates and computational cost for re-implemented ResNets versus SE-ResNets (values as reported by Hu et al. for the ImageNet validation set):

Model | ResNet Top-1 err. (%) | ResNet Top-5 err. (%) | ResNet GFLOPs | SE-ResNet Top-1 err. (%) | SE-ResNet Top-5 err. (%) | SE-ResNet GFLOPs
---|---|---|---|---|---|---
ResNet-50 | 24.80 | 7.48 | 3.86 | 23.29 | 6.62 | 3.87
ResNet-101 | 23.17 | 6.52 | 7.58 | 22.38 | 6.07 | 7.60
ResNet-152 | 22.42 | 6.34 | 11.30 | 21.57 | 5.73 | 11.32
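The SE-ResNet columns differ from the plain ResNet columns only by a squeeze-and-excitation (SE) stage added to each bottleneck block. Below is a minimal sketch of that stage, assuming the commonly used reduction ratio r = 16; the class name and defaults are illustrative, not taken from the paper.

```python
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-excitation: pool each channel to a scalar, pass the result
    through a small two-layer MLP, and rescale the channels with the output."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        s = x.mean(dim=(2, 3))           # squeeze: (b, c) channel descriptors
        w = self.fc(s).view(b, c, 1, 1)  # excitation: per-channel gates in (0, 1)
        return x * w                     # rescale the feature map channel-wise
```

In SE-ResNet, this rescaling is applied to the output of each bottleneck's residual branch before it is added to the skip path, which leaves the identity gradient path intact.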