# Heuristic Method for Minimizing Model Size of CNN by Combining Multiple Pruning Techniques


## Abstract


## 1. Introduction

- We propose a model-compression method that applies structured pruning and unstructured pruning jointly, while ensuring that the compressed CNN models meet and maintain the target accuracy given by their applications, thereby achieving the best reduction ratio for the model parameters.
- We develop an algorithm that, under a given target accuracy, achieves a better compression ratio than either the structured or the unstructured pruning method used individually.
- We demonstrate the effectiveness of the proposed method through evaluations on five actual CNN models and validate the correctness of the algorithm.
- We optimize the proposed algorithm so that it finds the best compression ratio in significantly less computational time than a brute-force search.

## 2. Background and Definitions

#### 2.1. Object Detection Methods by CNNs

#### 2.2. Pruning Methods for CNNs

**Network Slimming.** Figure 2 illustrates the steps of network slimming. This method introduces a channel-associated scaling factor, originally defined in [37] as the scale parameter of each batch normalization (BN) layer. Because modern CNN models typically place a BN layer right after each convolutional layer, the scaling factors in the BN layers can be leveraged directly to identify unimportant channels. When L1 regularization is imposed on the scaling factors of the convolutional layers, the factors corresponding to unnecessary channels are pushed towards zero. Channels whose scaling factors fall below a pre-defined threshold are then removed, as shown in orange on the left side of the figure. Through this process, CNN models can be compressed efficiently, especially in the convolutional layers.
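The channel-selection step above can be sketched in a few lines. The following is a minimal illustration, not the paper's implementation: `gammas` stands for the BN scaling factors of one convolutional layer after L1-regularized training, and the fixed threshold is a placeholder value.

```python
import numpy as np

def slim_channels(gammas, threshold=0.1):
    """Mark which channels survive network slimming.

    Channels whose BN scaling factor falls below `threshold` are
    pruned; the names and threshold here are illustrative only.
    """
    gammas = np.asarray(gammas, dtype=float)
    keep = np.abs(gammas) >= threshold  # boolean mask of surviving channels
    return keep

# Four channels after L1-regularized training: the two near-zero
# scaling factors identify the prunable channels.
mask = slim_channels([0.82, 0.003, 0.41, 0.0007], threshold=0.1)
print(mask.tolist())  # [True, False, True, False]
```

In a real network, the surviving mask would then be used to slice the convolutional kernels and BN parameters, yielding a physically smaller layer.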

**Deep Compression.** Figure 3 shows the weight-pruning process used in deep compression. It removes from the network the connections between neurons whose weight values lie beneath a given threshold, shown in blue on the left side of the figure. During this reduction of connections, it also removes neurons left without any input or output connection, shown in orange on the left side. By reducing the parameters (i.e., weights), especially the dense parameters of the fully connected layers, the method finally yields a compressed CNN model.
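Both steps, thresholding connections and dropping disconnected neurons, reduce to simple array operations. The sketch below is an illustration under assumed names (`prune_weights`, `dead_neurons`), not the deep-compression implementation itself:

```python
import numpy as np

def prune_weights(w, threshold):
    """Zero out connections whose magnitude is below `threshold`
    (the weight-pruning step)."""
    w = np.array(w, dtype=float)
    w[np.abs(w) < threshold] = 0.0
    return w

def dead_neurons(w):
    """Neurons of the next layer that lost every input connection
    (an all-zero column) can be removed along with their outputs."""
    return np.all(w == 0.0, axis=0)

# A 3x3 fully connected weight matrix; each column is the input fan
# of one neuron in the next layer.
w = prune_weights([[0.9, 0.01, -0.02],
                   [-0.7, 0.03, 0.5],
                   [0.1, -0.04, 0.02]], threshold=0.05)
print(dead_neurons(w).tolist())  # [False, True, False]
```

Here the middle neuron loses all three of its input connections and would be removed, which is exactly the cascade the figure depicts.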

## 3. Heuristic Method for Minimizing Model Size of CNN

#### 3.1. Strategy for Minimizing the Model Size

**Pruning scheme.** We first apply the structured pruning method to a CNN model to compress the convolutional layers. The reduction ratio, defined as the percentage of compression given to the pruning process, is gradually increased and applied to the original network until the model accuracy falls beneath a given target accuracy. The method then switches to the unstructured pruning method to compress the fully connected layers, again increasing the reduction ratio until the accuracy falls beneath the target accuracy. Through these two pruning methods, we finally identify the model of minimum size. Mathematically, let $SP(m,p)$ denote structured pruning of a model $m$ at a reduction ratio of $p$ percent; it returns a pair of the compressed model ${m}^{\prime}$ and the accuracy $Acc$. By increasing $p$ and comparing $Acc$ with the target accuracy, ${m}^{\prime}$ is passed to the unstructured pruning $USP({m}^{\prime},{p}^{\prime})$ when $Acc$ becomes less than the target accuracy. The unstructured pruning likewise returns a compressed model ${m}^{\prime\prime}$ and an accuracy $Acc^{\prime}$. Finally, by increasing the reduction ratio ${p}^{\prime}$ and comparing $Acc^{\prime}$ with the target accuracy, the compression finishes at the reduction ratio right before ${p}^{\prime}$ results in a worse accuracy than the target.

**Initial Margin.** As described above, the structured pruning method compresses a CNN model, gradually increasing the reduction ratio until the accuracy no longer meets the target. The compressed model obtained after this process is the minimum achievable by structured pruning alone. However, if we pass this model directly to the unstructured pruning method, it may not leave enough room for further compression by the unstructured pruning. Therefore, our method introduces a margin on the reduction ratio when ${m}^{\prime}$ is handed to the unstructured pruning: the reduction ratio $p$ at which ${m}^{\prime}$ was derived is drawn back by the margin value before the pruning method switches to unstructured pruning.
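The draw-back itself is a one-line computation. As a small worked example (function and variable names are illustrative, not the paper's notation):

```python
def resume_ratio(p_stop, margin):
    """Reduction ratio (%) from which unstructured pruning starts:
    the ratio where structured pruning fell below the target accuracy,
    drawn back by `margin` and clamped at zero."""
    return max(p_stop - margin, 0)

# Structured pruning dropped below the target accuracy at a 70%
# reduction ratio; with a margin of 5, the model handed to the
# unstructured pruning is the one compressed at 65%.
print(resume_ratio(70, 5))  # 65
```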

**Parameters to be configured.** Our proposed method defines three parameters, all configured on the application side. The first is the target accuracy for the final compressed model, which directly affects its size. The second is the step_width used to increase the reduction ratios $p$ and ${p}^{\prime}$ for the structured and unstructured pruning methods. A larger step_width accelerates the compression, but it may step over the reduction ratio that achieves the minimum model size. Conversely, when the step_width is too small, a large number of iterations is needed to find the minimized compressed model, wasting calculation resources and execution time. The final parameter is the margin, used as mentioned above for finding a suitable reduction ratio for the structured pruning method. If the margin is configured too large, the structured pruning phase leaves the model insufficiently compressed, and our method must rely on the unstructured pruning to reduce the remaining size. Conversely, when the margin is too small, the model is over-compressed by the structured pruning, leaving little room for further compression through the unstructured pruning.

**Step 1** applies network slimming to the initial model until the model's accuracy falls beneath the target accuracy (gray dotted line).

**Step 2** draws back (yellow dotted line) the reduction ratio reached in Step 1 by the margin value (red square) to obtain a model that can be pruned further (green square).

**Step 3** switches to deep compression until the accuracy falls beneath the target accuracy again. Here, we adopt the reduction ratio derived at the previous accuracy check (blue circle) for the deep compression. Finally, our method returns the minimum model derived from Step 3. In addition, Step 2 and Step 3 are repeated with different margin values to minimize the model size. Through this process, both the convolutional layers and the fully connected layers are effectively compressed, and we thus obtain a model of minimized size.

#### 3.2. Algorithms for Minimizing the Model Size

**Algorithm 1.** Pseudocode of the algorithm for minimizing the model size.
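The body of Algorithm 1 did not survive extraction. The following Python sketch reconstructs the two-phase search described in Section 3.1 under stated assumptions: `sp` and `usp` stand for the structured and unstructured pruning routines as callables `(model, ratio) -> (pruned_model, accuracy)`, and all names are illustrative rather than the paper's exact pseudocode.

```python
def minimize_model(model, sp, usp, target_acc, step_width, margin):
    """Two-phase pruning search sketched from Section 3.1.

    Phase 1 raises the structured-pruning ratio until accuracy drops
    below the target; the ratio is then drawn back by `margin`, and
    Phase 2 repeats the search with unstructured pruning.
    """
    # Phase 1: structured pruning (e.g., network slimming).
    p = 0
    while True:
        _, acc = sp(model, p + step_width)
        if acc < target_acc:
            break
        p += step_width
    # Draw back by the margin, clamped at zero, and materialize the
    # model that is handed over to the unstructured pruning.
    p = max(p - margin, 0)
    m1, _ = sp(model, p)
    # Phase 2: unstructured pruning (e.g., weight pruning) on m1.
    best, q = m1, 0
    while True:
        m2, acc = usp(m1, q + step_width)
        if acc < target_acc:
            break
        best, q = m2, q + step_width
    return best
```

With toy pruning functions whose accuracy decays linearly in the ratio, the loop stops exactly one step before the target accuracy is violated, matching the stopping rule of the pruning scheme.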

## 4. Experimental Evaluations

#### 4.1. Experimental Setup

#### 4.2. Evaluation on Minimizing Performance for Model Size

#### 4.3. Evaluation for Calculation Overhead and Execution Time

## 5. Discussion

## 6. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Conflicts of Interest

## References

- LeCun, Y.; Boser, B.; Denker, J.; Henderson, D.; Howard, R.; Hubbard, W.; Jackel, L. Handwritten Digit Recognition with a Back-Propagation Network. In Proceedings of the Advances in Neural Information Processing Systems, Denver, CO, USA, 27–30 November 1989; Touretzky, D., Ed.; Morgan-Kaufmann: Burlington, MA, USA, 1989; Volume 2.
- Kim, Y. Convolutional Neural Networks for Sentence Classification. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014; Association for Computational Linguistics: Doha, Qatar, 2014; pp. 1746–1751.
- Chen, C.; Seff, A.; Kornhauser, A.; Xiao, J. Deepdriving: Learning affordance for direct perception in autonomous driving. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 11–18 December 2015; pp. 2722–2730.
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587.
- Zhiqiang, W.; Jun, L. A review of object detection based on convolutional neural network. In Proceedings of the 2017 36th Chinese Control Conference (CCC), Dalian, China, 26–28 July 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 11104–11109.
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.
- Sultana, F.; Sufian, A.; Dutta, P. Advancements in Image Classification using Convolutional Neural Network. arXiv **2018**, arXiv:1905.03288.
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. **2012**, 25, 1–9.
- Zeiler, M.D.; Fergus, R. Visualizing and understanding convolutional networks. In Proceedings of the European Conference on Computer Vision, Zurich, Switzerland, 6–12 September 2014; Springer: Berlin, Germany, 2014; pp. 818–833.
- Chauhan, R.; Ghanshala, K.K.; Joshi, R. Convolutional Neural Network (CNN) for Image Detection and Recognition. In Proceedings of the 2018 First International Conference on Secure Cyber Computing and Communication (ICSCCC), Jalandhar, India, 15–17 December 2018; pp. 278–282.
- Sarvadevabhatla, S.R.K.; Babu, R. A Taxonomy of Deep Convolutional Neural Nets for Computer Vision. Front. Robot. AI **2016**, 2, 36.
- Chen, L.C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. **2018**, 40, 834–848.
- Singh, A.; Agarwal, S.; Nagrath, P.; Saxena, A.; Thakur, N. Human Pose Estimation Using Convolutional Neural Networks. In Proceedings of the 2019 Amity International Conference on Artificial Intelligence (AICAI), Dubai, United Arab Emirates, 4–6 February 2019; pp. 946–952.
- Dantone, M.; Gall, J.; Leistner, C.; Van Gool, L. Human Pose Estimation Using Body Parts Dependent Joint Regressors. In Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, USA, 23–28 June 2013; pp. 3041–3048.
- Li, S.; Dou, Y.; Xu, J.; Wang, Q.; Niu, X. mmCNN: A Novel Method for Large Convolutional Neural Network on Memory-Limited Devices. In Proceedings of the 2018 IEEE 42nd Annual Computer Software and Applications Conference (COMPSAC), Tokyo, Japan, 23–27 July 2018; Volume 01, pp. 881–886.
- He, Y.; Kang, G.; Dong, X.; Fu, Y.; Yang, Y. Soft filter pruning for accelerating deep convolutional neural networks. arXiv **2018**, arXiv:1808.06866.
- Iandola, F.N.; Han, S.; Moskewicz, M.W.; Ashraf, K.; Dally, W.J.; Keutzer, K. SqueezeNet: AlexNet-level accuracy with 50× fewer parameters and <0.5 MB model size. arXiv **2016**, arXiv:1602.07360.
- Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv **2017**, arXiv:1704.04861.
- Liu, Z.; Li, J.; Shen, Z.; Huang, G.; Yan, S.; Zhang, C. Learning efficient convolutional networks through network slimming. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2736–2744.
- Luo, J.H.; Wu, J.; Lin, W. Thinet: A filter level pruning method for deep neural network compression. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 5058–5066.
- LeCun, Y.; Denker, J.; Solla, S. Optimal brain damage. Adv. Neural Inf. Process. Syst. **1989**, 2, 598–605.
- Han, S.; Mao, H.; Dally, W.J. Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding. arXiv **2015**, arXiv:1510.00149.
- Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv **2014**, arXiv:1409.1556.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely Connected Convolutional Networks. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 2261–2269.
- Woo, Y.; Kim, D.; Jeong, J.; Ko, Y.W.; Lee, J.G. Zero-Keep Filter Pruning for Energy/Power Efficient Deep Neural Networks. Electronics **2021**, 10, 1238.
- Kim, Y.; Kong, J.; Munir, A. CPU-Accelerator Co-Scheduling for CNN Acceleration at the Edge. IEEE Access **2020**, 8, 211422–211433.
- Chen, T.; Du, Z.; Sun, N.; Wang, J.; Wu, C.; Chen, Y.; Temam, O. Diannao: A small-footprint high-throughput accelerator for ubiquitous machine-learning. ACM SIGARCH Comput. Archit. News **2014**, 42, 269–284.
- Liu, Y.; Wang, Y.; Yu, R.; Li, M.; Sharma, V.; Wang, Y. Optimizing CNN Model Inference on CPUs. In Proceedings of the 2019 USENIX Annual Technical Conference (USENIX ATC 19), Renton, WA, USA, 10–12 July 2019; pp. 1025–1040.
- Chetlur, S.; Woolley, C.; Vandermersch, P.; Cohen, J.; Tran, J.; Catanzaro, B.; Shelhamer, E. cudnn: Efficient primitives for deep learning. arXiv **2014**, arXiv:1410.0759.
- Li, S.; Dou, Y.; Lv, Q.; Wang, Q.; Niu, X.; Yang, K. Optimized GPU Acceleration Algorithm of Convolutional Neural Networks for Target Detection. In Proceedings of the 2016 IEEE 18th International Conference on High Performance Computing and Communications; IEEE 14th International Conference on Smart City; IEEE 2nd International Conference on Data Science and Systems (HPCC/SmartCity/DSS), Sydney, NSW, Australia, 12–14 December 2016; pp. 224–230.
- Vu, T.H.; Murakami, R.; Okuyama, Y.; Ben Abdallah, A. Efficient Optimization and Hardware Acceleration of CNNs towards the Design of a Scalable Neuro inspired Architecture in Hardware. In Proceedings of the 2018 IEEE International Conference on Big Data and Smart Computing (BigComp), Shanghai, China, 15–17 January 2018; pp. 326–332.
- Wang, J.; Lin, J.; Wang, Z. Efficient Hardware Architectures for Deep Convolutional Neural Network. IEEE Trans. Circuits Syst. I Regul. Pap. **2018**, 65, 1941–1953.
- Losh, M.; Llamocca, D. A low-power spike-like neural network design. Electronics **2019**, 8, 1479.
- Han, S.; Pool, J.; Tran, J.; Dally, W. Learning both weights and connections for efficient neural network. Adv. Neural Inf. Process. Syst. **2015**, 28, 1135–1143.
- Lin, J.; Liu, Z.; Wang, H.; Han, S. AMC: AutoML for Model Compression and Acceleration on Mobile Devices. arXiv **2018**, arXiv:1802.03494.
- Ioffe, S.; Szegedy, C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In Proceedings of the International Conference on Machine Learning, Lille, France, 7–9 July 2015; PMLR: Lille, France; pp. 448–456.
- Krizhevsky, A.; Hinton, G. Learning Multiple Layers of Features from Tiny Images. Master's Thesis, Department of Computer Science, University of Toronto, Toronto, ON, Canada, 2009.

**Figure 1.** A basic architecture of the Convolutional Neural Network (CNN). A typical CNN is mainly composed of a convolutional layer, a pooling layer and a fully connected layer.

**Figure 2.** The network slimming process. The channels (left side in orange color) with small scaling factor values (the numbers in orange color) will be eliminated.

**Figure 3.** The weight pruning process. The connections (left side in blue color) among neurons with small weight values will be eliminated. The neurons (left side in orange color) without any input or output connection will also be eliminated.

**Figure 5.** Compression iterations of our proposed method during the search for the minimum model size. Step 1 is the iteration to find the minimal model by structured pruning. Step 2 is the iteration defined by the margin. Step 3 is the iteration to find the minimal model by unstructured pruning.

**Figure 6.** The minimum model sizes derived by the proposed method when the initial margin is varied from 1 to 50 for VGG-19 when the target accuracy is 92%.

**Figure 7.** The minimum model sizes derived by the proposed method when the initial margin is varied from 1 to 50 for ResNet-110 when the target accuracy is 94%.

**Figure 8.** The minimum model sizes derived by the proposed method when the initial margin is varied from 1 to 50 for DenseNet-40 when the target accuracy is 92%.

**Figure 9.** The minimum model sizes derived by the proposed method when the initial margin is varied from 1 to 50 for DenseNet-121 when the target accuracy is 94%.

**Figure 10.** The minimum model sizes derived by the proposed method when the initial margin is varied from 1 to 50 for DenseNet-202 when the target accuracy is 95%.

**Table 1.** The accuracies and model sizes of the original CNN models used in the evaluations.

| Model | Accuracy | Model Size |
|---|---|---|
| VGG-19 | 93.99% | 80.34 MB |
| ResNet-110 | 94.59% | 4.61 MB |
| DenseNet-40 | 94.16% | 4.26 MB |
| DenseNet-121 | 95.51% | 42.15 MB |
| DenseNet-202 | 95.99% | 117.24 MB |

**Table 2.**The sizes of the minimized models in the corresponding accuracies derived from the proposed method and the individual cases of network slimming and deep compression for VGG-19.

| Target Accuracy | Deep Compression | Network Slimming | Ours (Margin) |
|---|---|---|---|
| 85% | 6.42 MB | 10.2 MB | 5.91 MB (4) |
| 90% | 7.23 MB | 10.2 MB | 6.11 MB (5) |
| 92% | 7.23 MB | 10.2 MB | 6.23 MB (4) |

**Table 3.**The sizes of the minimized models in the corresponding accuracies derived from the proposed method and the individual cases of network slimming and deep compression for ResNet-110.

| Target Accuracy | Deep Compression | Network Slimming | Ours (Margin) |
|---|---|---|---|
| 85% | 1.14 MB | 3.77 MB | 1.13 MB (20) |
| 90% | 1.46 MB | 3.87 MB | 1.44 MB (40) |
| 94% | 2.19 MB | 4.06 MB | 2.15 MB (7) |

**Table 4.**The sizes of the minimized models in the corresponding accuracies derived from the proposed method and the individual cases of network slimming and deep compression for DenseNet-40.

| Target Accuracy | Deep Compression | Network Slimming | Ours (Margin) |
|---|---|---|---|
| 85% | 0.71 MB | 1.81 MB | 0.68 MB (21) |
| 90% | 0.92 MB | 1.97 MB | 0.90 MB (16) |
| 92% | 1.13 MB | 2.02 MB | 1.12 MB (19) |

**Table 5.**The sizes of the minimized models in the corresponding accuracies derived from the proposed method and the individual cases of network slimming and deep compression for DenseNet-121.

| Target Accuracy | Deep Compression | Network Slimming | Ours (Margin) |
|---|---|---|---|
| 85% | 2.49 MB | 8.30 MB | 2.28 MB (22) |
| 90% | 2.90 MB | 8.72 MB | 2.49 MB (25) |
| 94% | 3.73 MB | 9.96 MB | 3.73 MB (4) |

**Table 6.**The sizes of the minimized models in the corresponding accuracies derived from the proposed method and the individual cases of network slimming and deep compression for DenseNet-202.

| Target Accuracy | Deep Compression | Network Slimming | Ours (Margin) |
|---|---|---|---|
| 85% | 4.69 MB | 15.10 MB | 3.52 MB (50) |
| 90% | 4.69 MB | 15.10 MB | 4.32 MB (29) |
| 95% | 7.03 MB | 19.07 MB | 6.87 MB (27) |

**Table 7.** The execution times to reach the corresponding target accuracies with the brute-force search method and the proposed method for VGG-19.

| Target Accuracy | Brute-Force Search Method | Ours (Number of Accuracy Checks) |
|---|---|---|
| 85% | 7.5 h | 0.5 h (656) |
| 90% | 7.5 h | 0.5 h (659) |
| 92% | 7.5 h | 0.5 h (654) |

**Table 8.** The execution times to reach the corresponding target accuracies with the brute-force search method and the proposed method for ResNet-110.

| Target Accuracy | Brute-Force Search Method | Ours (Number of Accuracy Checks) |
|---|---|---|
| 85% | 8.3 h | 0.45 h (597) |
| 90% | 8.3 h | 0.35 h (470) |
| 94% | 8.3 h | 0.30 h (388) |

**Table 9.** The execution times to reach the corresponding target accuracies with the brute-force search method and the proposed method for DenseNet-40.

| Target Accuracy | Brute-Force Search Method | Ours (Number of Accuracy Checks) |
|---|---|---|
| 85% | 7.5 h | 0.6 h (663) |
| 90% | 7.5 h | 0.6 h (678) |
| 92% | 7.5 h | 0.5 h (579) |

**Table 10.** The execution times to reach the corresponding target accuracies with the brute-force search method and the proposed method for DenseNet-121.

| Target Accuracy | Brute-Force Search Method | Ours (Number of Accuracy Checks) |
|---|---|---|
| 85% | 28 h | 2.1 h (737) |
| 90% | 28 h | 1.9 h (690) |
| 94% | 28 h | 2.0 h (705) |

**Table 11.** The execution times to reach the corresponding target accuracies with the brute-force search method and the proposed method for DenseNet-202.

| Target Accuracy | Brute-Force Search Method | Ours (Number of Accuracy Checks) |
|---|---|---|
| 85% | 92.5 h | 6.3 h (689) |
| 90% | 92.5 h | 6.35 h (719) |
| 95% | 92.5 h | 6.5 h (739) |

**Table 12.**The sizes of the minimized models in the corresponding accuracies derived from the individual cases of network slimming and deep compression, NS→DC and DC→NS for VGG-19.

| Target Accuracy | Deep Compression | Network Slimming | NS→DC | DC→NS |
|---|---|---|---|---|
| 85% | 6.42 MB | 10.2 MB | 5.91 MB | 7.23 MB |
| 90% | 7.23 MB | 10.2 MB | 6.11 MB | 8.03 MB |
| 92% | 7.23 MB | 10.2 MB | 6.23 MB | 8.03 MB |

**Table 13.**The sizes of the minimized models in the corresponding accuracies derived from the individual cases of network slimming and deep compression, NS→DC and DC→NS for ResNet-110.

| Target Accuracy | Deep Compression | Network Slimming | NS→DC | DC→NS |
|---|---|---|---|---|
| 85% | 1.14 MB | 3.77 MB | 1.13 MB | 1.17 MB |
| 90% | 1.46 MB | 3.87 MB | 1.44 MB | 1.47 MB |
| 94% | 2.19 MB | 4.06 MB | 2.15 MB | 2.19 MB |

**Table 14.**The sizes of the minimized models in the corresponding accuracies derived from the individual cases of network slimming and deep compression, NS→DC and DC→NS for DenseNet-40.

| Target Accuracy | Deep Compression | Network Slimming | NS→DC | DC→NS |
|---|---|---|---|---|
| 85% | 0.71 MB | 1.81 MB | 0.68 MB | 0.73 MB |
| 90% | 0.92 MB | 1.97 MB | 0.90 MB | 0.94 MB |
| 92% | 1.13 MB | 2.02 MB | 1.12 MB | 1.14 MB |

**Table 15.**The sizes of the minimized models in the corresponding accuracies derived from the individual cases of network slimming and deep compression, NS→DC and DC→NS for DenseNet-121.

| Target Accuracy | Deep Compression | Network Slimming | NS→DC | DC→NS |
|---|---|---|---|---|
| 85% | 2.49 MB | 8.30 MB | 2.28 MB | 2.55 MB |
| 90% | 2.90 MB | 8.72 MB | 2.49 MB | 2.96 MB |
| 94% | 3.73 MB | 9.96 MB | 3.28 MB | 3.89 MB |

**Table 16.**The sizes of the minimized models in the corresponding accuracies derived from the individual cases of network slimming and deep compression, NS→DC and DC→NS for DenseNet-202.

| Target Accuracy | Deep Compression | Network Slimming | NS→DC | DC→NS |
|---|---|---|---|---|
| 85% | 4.69 MB | 15.09 MB | 3.52 MB | 4.84 MB |
| 90% | 4.69 MB | 15.09 MB | 4.32 MB | 5.15 MB |
| 95% | 7.03 MB | 19.07 MB | 6.87 MB | 8.21 MB |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Tian, D.; Yamagiwa, S.; Wada, K.
Heuristic Method for Minimizing Model Size of CNN by Combining Multiple Pruning Techniques. *Sensors* **2022**, *22*, 5874.
https://doi.org/10.3390/s22155874
