Zero-Keep Filter Pruning for Energy/Power Efficient Deep Neural Networks
Abstract
1. Introduction
1.1. Related Works
2. Zero-Keep Filter Pruning
Algorithm 1 Zero-keep filter pruning algorithm.
Input: Pre-trained network, pruning rate n%
Output: Pruned network
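Since Algorithm 1 is shown here only by its input and output, the following is a minimal PyTorch-style sketch of one way a zero-keep criterion could be realized: near-zero weights are forced to zero, filters are ranked by how many zero weights they contain, and the filters with the fewest zeros are suppressed so that the zero-rich filters are kept. The function name, the `zero_threshold` value, and the use of masking instead of physically removing filters are illustrative assumptions rather than the paper's exact procedure.

```python
import torch
import torch.nn as nn

def zero_keep_filter_pruning(model: nn.Module, prune_rate: float = 0.3,
                             zero_threshold: float = 1e-3) -> nn.Module:
    """Illustrative sketch of a zero-keep filter pruning pass (assumed details)."""
    for module in model.modules():
        if not isinstance(module, nn.Conv2d):
            continue
        w = module.weight.data                              # (C_out, C_in, kH, kW)
        # Assumed step: force near-zero weights to exactly zero.
        w[w.abs() < zero_threshold] = 0.0
        # Count zero weights per output filter.
        zeros_per_filter = (w == 0).flatten(1).sum(dim=1)   # (C_out,)
        n_prune = int(prune_rate * w.size(0))
        if n_prune == 0:
            continue
        # Keep the zero-rich filters: suppress those with the FEWEST zeros.
        prune_idx = torch.argsort(zeros_per_filter)[:n_prune]
        w[prune_idx] = 0.0                                   # masking stands in for removal
        if module.bias is not None:
            module.bias.data[prune_idx] = 0.0
    return model
```

After such a pruning pass, the network would typically be fine-tuned for a number of epochs to recover accuracy, as is standard in filter pruning pipelines.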
3. Experiment
3.1. Setting Up of the Experiment
1. Filter pruning happens in the layers from the 1st to the 13th.
2. Filter pruning happens in the layers from the 5th to the 13th.
3. Filter pruning happens in the layers from the 1st to the 11th.
4. Filter pruning happens in the layers from the 5th to the 11th (a layer-selection sketch for these four configurations follows this list).
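As a companion to these four configurations, the snippet below is an illustrative helper (the function name and slicing convention are assumptions, not code from the paper) for picking out the convolutional layers whose 1-based index falls within one of the ranges above.

```python
import torch.nn as nn

def conv_layers_in_range(model: nn.Module, first: int, last: int):
    """Return the Conv2d modules whose 1-based position among all conv layers
    of `model` lies in [first, last], e.g. (1, 13), (5, 13), (1, 11) or (5, 11)."""
    convs = [m for m in model.modules() if isinstance(m, nn.Conv2d)]
    return convs[first - 1:last]
```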
3.2. Results
- The computational benefit of the proposed scheme: To investigate the possible impact of the proposed scheme on computation and energy/power efficiency, we examined what percentage of the total multiplications can be skipped. To obtain this number, we counted the multiplications required in the convolutional layers of our deep learning models and checked their outputs. Since a ‘zero’ output implies that at least one of the two input operands is zero, we can count the multiplications that could be skipped, without consuming switching power/energy, by the specially designed multiplier discussed in the following section (a counting sketch is given after this paragraph). Note that one input operand of each multiplication comes from a convolution filter, while the other comes from a feature map.
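The counting described above can be reproduced with an unfold-based sketch such as the one below; the function name and the exhaustive enumeration of weight-activation products are illustrative assumptions, not the paper's actual measurement code (a standard, ungrouped convolution and a 4-D input batch are assumed).

```python
import torch
import torch.nn as nn

def skippable_mul_ratio(conv: nn.Conv2d, x: torch.Tensor) -> float:
    """Fraction of the multiplications performed by `conv` on input `x` whose
    result is zero because at least one operand is zero (i.e., skippable)."""
    unfold = nn.Unfold(kernel_size=conv.kernel_size, stride=conv.stride,
                       padding=conv.padding, dilation=conv.dilation)
    patches = unfold(x)                        # (N, C_in*kH*kW, L): one column per output position
    w = conv.weight.flatten(1)                 # (C_out, C_in*kH*kW)
    n, k, l = patches.shape
    total = n * w.size(0) * k * l              # every multiplication the layer requires
    # A product is non-zero only if both its weight and its activation are non-zero.
    nz_w = (w != 0).sum(dim=0).float()             # non-zero weights per kernel position
    nz_a = (patches != 0).sum(dim=(0, 2)).float()  # non-zero activations per kernel position
    nonzero_products = (nz_w * nz_a).sum().item()
    return (total - nonzero_products) / total
```

Summing `total` and the skipped count over all convolutional layers of a model gives the overall percentage of skippable multiplications.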
4. Discussion: Zero-Skip Multiplication
5. Conclusions
Author Contributions
Funding
Conflicts of Interest
Device | Description
---|---
Processor | Intel Core i9-10980XE 3.00 GHz
Memory | 128 GB
GPU | GeForce RTX 3090 × 4

Software | Description
---|---
Operating System | Ubuntu 18.04 LTS 64-bit
Programming Language | Python 3.7.9
Deep Learning Library | PyTorch 1.7.1
Method | Pruning Rate (%) | NZER_ORIG (%) | NZER (%) | Number of Filters | Accuracy (%)
---|---|---|---|---|---
Random Pruning | 0 | 100 | 100 | 4224 | 94.15
Random Pruning | 5 | 90.53 | 100 | 4020 | 92.72
Random Pruning | 10 | 81.15 | 100 | 3807 | 92.86
Random Pruning | 15 | 72.52 | 100 | 3598 | 92.53
Random Pruning | 20 | 64.15 | 100 | 3385 | 92.19
Random Pruning | 25 | 56.25 | 100 | 3168 | 92.35
Random Pruning | 30 | 49.21 | 100 | 2964 | 91.88
Random Pruning | 35 | 42.36 | 100 | 2751 | 91.57
Random Pruning | 40 | 36.20 | 100 | 2542 | 91.45
Random Pruning | 45 | 30.36 | 100 | 2329 | 91.40
Ours | 0 | 100 | 100 | 4224 | 94.15
Ours | 5 | 85.92 | 94.97 | 4019 | 92.84
Ours | 10 | 72.13 | 88.95 | 3806 | 92.70
Ours | 15 | 59.44 | 82.02 | 3597 | 92.62
Ours | 20 | 47.51 | 74.11 | 3384 | 92.52
Ours | 25 | 36.58 | 65.09 | 3167 | 91.77
Ours | 30 | 26.97 | 54.85 | 2963 | 92.01
Ours | 35 | 18.26 | 43.15 | 2750 | 91.74
Ours | 40 | 10.79 | 29.85 | 2541 | 91.50
Ours | 45 | 4.49 | 14.80 | 2328 | 90.65
Method | Pruning Rate (%) | NZER_ORIG (%) | NZER (%) | Number of Filters | Accuracy (%)
---|---|---|---|---|---
Random Pruning | 0 | 100 | 100 | 4224 | 94.15
Random Pruning | 5 | 90.78 | 100 | 4038 | 92.74
Random Pruning | 10 | 81.63 | 100 | 3843 | 93.02
Random Pruning | 15 | 73.25 | 100 | 3654 | 93.21
Random Pruning | 20 | 65.08 | 100 | 3459 | 92.85
Random Pruning | 25 | 57.40 | 100 | 3264 | 92.77
Random Pruning | 30 | 50.52 | 100 | 3078 | 92.51
Random Pruning | 35 | 43.81 | 100 | 2883 | 92.32
Random Pruning | 40 | 37.80 | 100 | 2694 | 92.43
Random Pruning | 45 | 34.75 | 100 | 2499 | 92.28
Ours | 0 | 100 | 100 | 4224 | 94.15
Ours | 5 | 86.25 | 95.07 | 4037 | 92.89
Ours | 10 | 72.77 | 89.20 | 3842 | 92.84
Ours | 15 | 60.39 | 82.50 | 3653 | 92.81
Ours | 20 | 48.70 | 74.90 | 3458 | 92.82
Ours | 25 | 38.02 | 66.29 | 3263 | 92.91
Ours | 30 | 28.58 | 56.62 | 3077 | 92.62
Ours | 35 | 20.01 | 45.72 | 2882 | 92.63
Ours | 40 | 12.67 | 33.56 | 2693 | 92.35
Ours | 45 | 6.45 | 20.1 | 2498 | 92.28
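For clarity, the ratios reported in these tables appear to follow the definitions below, where NZER is taken to stand for the non-zero element ratio of the convolution weights; the formulas are a reconstruction from the reported numbers rather than a quotation from the paper:

$$
\mathrm{NZER_{ORIG}} = \frac{\#\,\text{non-zero weights after pruning}}{\#\,\text{weights in the original network}}, \qquad
\mathrm{NZER} = \frac{\#\,\text{non-zero weights after pruning}}{\#\,\text{weights remaining after filter pruning}}.
$$

For example, for the first convolutional layer in the layer-wise table below, our method leaves 322 of the original 1728 weights non-zero, giving NZER_ORIG = 322/1728 ≈ 18.63%, which matches the reported value.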
Layer Index | Original Number of Weights | Random Pruning: Remaining Non-Zero Weights (NZER_ORIG (%)) | Ours: Remaining Non-Zero Weights (NZER_ORIG (%)) | Random/Ours
---|---|---|---|---
1 | 1728 | 972 (56.25) | 322 (18.63) | 3 |
2 | 36,864 | 11,664 (31.64) | 1505 (4.08) | 7.8 |
3 | 73,728 | 23,004 (31.2) | 4621 (6.27) | 5 |
4 | 147,456 | 45,369 (30.77) | 9316 (6.32) | 4.9 |
5 | 294,912 | 90,099 (30.55) | 20,303 (6.88) | 4.4 |
6 | 589,824 | 178,929 (30.34) | 39,195 (6.65) | 4.6 |
7 | 589,824 | 178,929 (30.34) | 40,515 (6.87) | 4.4 |
8 | 1,179,648 | 357,858 (30.34) | 70,915 (6.01) | 5 |
9 | 2,359,296 | 715,716 (30.34) | 103,277 (4.38) | 6.9 |
10 | 2,359,296 | 715,716 (30.34) | 105,344 (4.47) | 6.8 |
11 | 2,359,296 | 715,716 (30.34) | 93,104 (3.95) | 7.7 |
12 | 2,359,296 | 715,716 (30.34) | 92,520 (3.92) | 7.7 |
13 | 2,359,296 | 715,716 (30.34) | 78,970 (3.35) | 9.1 |
Total | 14,710,464 | 4,465,404 (30.36) | 659,907 (4.49) | 5.9 |