Network pruning is widely used to compress computation- and memory-intensive deep learning models by removing their redundant components. By pruning granularity, network pruning methods fall into two categories: structured and unstructured. Structured pruning removes large components of a model, such as channels or layers, which may reduce accuracy. Unstructured pruning mainly removes individual parameters, as well as redundant channels or layers, which may result in inadequate pruning. To address these limitations, this paper proposes a heuristic method for minimizing model size. The method is implemented as an algorithm that combines structured and unstructured pruning while maintaining a target accuracy configured by the application. We use network slimming as the structured pruning method and deep compression as the unstructured one. The combined method achieves a higher compression ratio than either pruning method applied individually. To show its effectiveness, we evaluate the proposed method on the state-of-the-art CNN models VGGNet, ResNet, and DenseNet with the CIFAR-10 dataset. We compare the proposed method against the structured and unstructured pruning methods used individually and show that it achieves better performance with a higher compression ratio. In the best case, for VGGNet, our method reduces the model size by 13× and reduces the pruning time by 15× compared with the brute-force search method.
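To make the two pruning granularities concrete, the following is a minimal toy sketch in plain Python. It is not the paper's algorithm. It assumes a single layer stored as a list of channels (rows), a hypothetical per-channel importance score `scales` (network slimming ranks channels by batch-normalization scaling factors), and magnitude-based weight removal for the unstructured step. The function names, ratios, and values are illustrative assumptions, and accuracy, which the proposed method treats as a constraint, is not modelled here.

```python
def structured_prune(weights, scales, keep_ratio):
    """Drop whole channels (rows) with the smallest |scale| (assumed importance score)."""
    n_keep = max(1, round(len(weights) * keep_ratio))
    order = sorted(range(len(weights)), key=lambda i: abs(scales[i]), reverse=True)
    kept = sorted(order[:n_keep])  # preserve the original channel order
    return [list(weights[i]) for i in kept]


def unstructured_prune(weights, sparsity):
    """Zero the smallest-magnitude individual weights; the layer shape is unchanged."""
    flat = sorted(abs(w) for row in weights for w in row)
    n_zero = int(len(flat) * sparsity)
    if n_zero == 0:
        return [list(row) for row in weights]
    threshold = flat[n_zero - 1]  # ties at the threshold would zero extra weights
    return [[0.0 if abs(w) <= threshold else w for w in row] for row in weights]


def count_nonzero(weights):
    """Number of parameters that still have to be stored."""
    return sum(1 for row in weights for w in row if w != 0.0)


# Toy layer: 4 channels x 4 weights = 16 parameters (illustrative values only).
weights = [
    [0.8, -0.1, 0.5, 0.06],
    [0.03, 0.01, -0.02, 0.04],
    [-0.6, 0.3, 0.05, -0.9],
    [0.015, -0.035, 0.025, 0.005],
]
scales = [0.9, 0.05, 0.7, 0.01]  # hypothetical per-channel importance scores

total = count_nonzero(weights)
structured_only = structured_prune(weights, scales, keep_ratio=0.5)
unstructured_only = unstructured_prune(weights, sparsity=0.5)
combined = unstructured_prune(structured_only, sparsity=0.5)

for name, pruned in [("structured", structured_only),
                     ("unstructured", unstructured_only),
                     ("combined", combined)]:
    print(f"{name}: {total / count_nonzero(pruned):.1f}x fewer stored parameters")
```

In this toy setting, each method alone halves the number of stored parameters, while applying unstructured pruning to the already slimmed layer halves it again. Which pair of ratios still meets the target accuracy is the search problem that the proposed heuristic addresses; this sketch does not attempt it.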
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.