AE-Qdrop: Towards Accurate and Efficient Low-Bit Post-Training Quantization for A Convolutional Neural Network
Abstract
1. Introduction
- We perform a theoretical analysis of the shortcomings associated with adaptive rounding and block-wise reconstruction.
- We introduce AE-Qdrop, a two-stage algorithm consisting of block-wise reconstruction followed by global fine-tuning. The block-wise reconstruction stage combines a progressive optimization strategy with randomly weighted quantized activation, improving both the accuracy and the efficiency of the reconstruction; global fine-tuning is then applied to further optimize the weights and raise the overall quantization accuracy (a high-level sketch follows this list).
- Extensive experiments on mainstream networks demonstrate the excellent performance of AE-Qdrop in both quantization accuracy and quantization efficiency.
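To make the two-stage structure described above concrete, the following is a minimal sketch of the overall pipeline, assuming a PyTorch-style quantized model whose quantizers are already attached. The function name `ae_qdrop_pipeline`, the use of an MSE distillation loss for global fine-tuning, and the hyperparameters are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def ae_qdrop_pipeline(fp_model, q_model, calib_loader, finetune_steps=1000, lr=1e-5):
    """Two-stage skeleton of AE-Qdrop (illustrative sketch only)."""
    # ---- Stage 1: block-wise reconstruction (Sections 4.1 and 4.2) ----
    # Each block of q_model would be optimized here against the matching FP32
    # block, using the progressive optimization strategy and randomly weighted
    # quantized activation described in Section 4.
    ...

    # ---- Stage 2: global fine-tuning (Section 4.3) ----
    # Assumed here: an MSE distillation loss against the FP32 network output;
    # the loss actually used in the paper may differ.
    fp_model.eval()
    params = [p for p in q_model.parameters() if p.requires_grad]
    opt = torch.optim.Adam(params, lr=lr)
    for _, (x, _) in zip(range(finetune_steps), calib_loader):
        loss = F.mse_loss(q_model(x), fp_model(x).detach())
        opt.zero_grad()
        loss.backward()
        opt.step()
    return q_model
```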
2. Related Work
| Bit-Width | Optimization Goal | Related Work |
|---|---|---|
| ≥6 bit | The Quantization Error of Network Parameters | Optimizing Quantization Factor Scale [21,22] |
| | | Bias Correction [27,28] |
| | | Piecewise Linear Quantization [29,30] |
| | | Outlier Separation [31,32] |
| ≤4 bit | Layer-wise Reconstruction | LAPQ [23] |
| | | AdaRound [24] |
| | | AdaQuant [35] |
| | Block-wise Reconstruction | BrecQ [25] |
| | | RAPQ [33] |
| | | Mr.BiQ [34] |
| | | Qdrop [26] |
3. Background and Theoretical Analysis
3.1. Quantizer
3.2. AdaRound
3.3. Drawbacks of Adaptive Rounding
3.4. Drawbacks of Block-Wise Reconstruction
4. AE-Qdrop
4.1. Block-Wise Reconstruction: Progressive Optimization Strategy
- Quantize the activation while keeping the weight unquantized. Optimize the weight so that it absorbs the perturbation caused by activation quantization, and then set the corresponding upper and lower bounds according to Equation (10).
- Quantize the activation and keep the truncation (clipping) operation of the weight quantizer, but disable its rounding operation.
- Quantize both the activation and the weight. (The three stages are sketched in the code below.)
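A minimal sketch of how the weight quantizer behaves across the three stages is given below; the mode names and the assumption of symmetric signed weight quantization are illustrative, not taken from the paper.

```python
import torch

def weight_fake_quant(w: torch.Tensor, s: float, b: int, mode: str) -> torch.Tensor:
    """Weight quantizer behaviour for the three progressive stages
    (illustrative; mode names and symmetric signed quantization are assumptions)."""
    qmax = 2 ** (b - 1) - 1
    if mode == "off":         # Stage 1: weights stay in full precision
        return w
    if mode == "clip_only":   # Stage 2: apply the quantizer's clipping range, skip rounding
        return torch.clamp(w, -(qmax + 1) * s, qmax * s)
    # Stage 3 ("full"): ordinary fake quantization with clipping and rounding
    return torch.clamp(torch.round(w / s), -(qmax + 1), qmax) * s
```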
4.2. Block-Wise Reconstruction: Randomly Weighted Quantized Activation
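Only the heading of this subsection is preserved here. For orientation, the sketch below contrasts QDrop-style random dropping of quantized activations [26] with a randomly weighted mixture of quantized and full-precision activations that is consistent with the subsection title; the exact weighting scheme used by AE-Qdrop is not reproduced, and the probability `p` and weight `lam` are illustrative.

```python
import torch

def qdrop_activation(x_fp: torch.Tensor, x_q: torch.Tensor, p: float = 0.5) -> torch.Tensor:
    # Qdrop [26]: each element is quantized with probability p, otherwise kept full precision
    mask = (torch.rand_like(x_fp) < p).float()
    return mask * x_q + (1 - mask) * x_fp

def randomly_weighted_activation(x_fp: torch.Tensor, x_q: torch.Tensor) -> torch.Tensor:
    # A randomly weighted mix of quantized and full-precision activations,
    # consistent with the title of Section 4.2 (the paper's exact scheme may differ)
    lam = torch.rand_like(x_fp)
    return lam * x_q + (1 - lam) * x_fp
```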
4.3. Global Fine-Tuning
5. Experimental Results
5.1. Experimental Setup and Implementation Details
5.2. Comprehensive Comparison
5.3. Ablation Study
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. Zhu, A.; Wang, B.; Xie, J.; Ma, C. Lightweight Tunnel Defect Detection Algorithm Based on Knowledge Distillation. Electronics 2023, 12, 3222.
2. Wu, P.; Wang, Z.; Li, H.; Zeng, N. KD-PAR: A knowledge distillation-based pedestrian attribute recognition model with multi-label mixed feature learning network. Expert Syst. Appl. 2024, 237, 121305.
3. Lopes, V.; Carlucci, F.M.; Esperança, P.M.; Singh, M.; Yang, A.; Gabillon, V.; Xu, H.; Chen, Z.; Wang, J. Manas: Multi-agent neural architecture search. Mach. Learn. 2024, 113, 73–96.
4. Song, Y.; Wang, A.; Zhao, Y.; Wu, H.; Iwahori, Y. Multi-Scale Spatial–Spectral Attention-Based Neural Architecture Search for Hyperspectral Image Classification. Electronics 2023, 12, 3641.
5. Li, Y.; Adamczewski, K.; Li, W.; Gu, S.; Timofte, R.; Van Gool, L. Revisiting random channel pruning for neural network compression. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 191–201.
6. Shen, W.; Wang, W.; Zhu, J.; Zhou, H.; Wang, S. Pruning- and Quantization-Based Compression Algorithm for Number of Mixed Signals Identification Network. Electronics 2023, 12, 1694.
7. Gholami, A.; Kim, S.; Dong, Z.; Yao, Z.; Mahoney, M.W.; Keutzer, K. A survey of quantization methods for efficient neural network inference. In Low-Power Computer Vision; Chapman and Hall/CRC: Boca Raton, FL, USA, 2022; pp. 291–326.
8. Ahn, H.; Chen, T.; Alnaasan, N.; Shafi, A.; Abduljabbar, M.; Subramoni, H. Performance Characterization of using Quantization for DNN Inference on Edge Devices: Extended Version. arXiv 2023, arXiv:2303.05016.
9. Liu, Z.; Cheng, K.T.; Huang, D.; Xing, E.P.; Shen, Z. Nonuniform-to-uniform quantization: Towards accurate quantization via generalized straight-through estimation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 4942–4952.
10. Kim, D.; Lee, J.; Ham, B. Distance-aware quantization. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 5271–5280.
11. Peng, H.; Wu, J.; Zhang, Z.; Chen, S.; Zhang, H.T. Deep network quantization via error compensation. IEEE Trans. Neural Netw. Learn. Syst. 2021, 33, 4960–4970.
12. Esser, S.K.; McKinstry, J.L.; Bablani, D.; Appuswamy, R.; Modha, D.S. Learned Step Size Quantization. In Proceedings of the 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, 26–30 April 2020.
13. Bhalgat, Y.; Lee, J.; Nagel, M.; Blankevoort, T.; Kwak, N. LSQ+: Improving low-bit quantization through learnable offsets and better initialization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Seattle, WA, USA, 14–19 June 2020; pp. 696–697.
14. Lee, J.; Kim, D.; Ham, B. Network quantization with element-wise gradient scaling. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 6448–6457.
15. Li, Z.; Ni, B.; Li, T.; Yang, X.; Zhang, W.; Gao, W. Residual quantization for low bit-width neural networks. IEEE Trans. Multimed. 2021, 25, 214–227.
16. Xu, W.; Li, F.; Jiang, Y.; Yong, A.; He, X.; Wang, P.; Cheng, J. Improving extreme low-bit quantization with soft threshold. IEEE Trans. Circuits Syst. Video Technol. 2022, 33, 1549–1563.
17. Guo, N.; Bethge, J.; Meinel, C.; Yang, H. Join the high accuracy club on ImageNet with a binary neural network ticket. arXiv 2022, arXiv:2211.12933.
18. Zhu, K.; He, Y.Y.; Wu, J. Quantized Feature Distillation for Network Quantization. In Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence, Washington, DC, USA, 7–14 February 2023.
19. Pei, Z.; Yao, X.; Zhao, W.; Yu, B. Quantization via distillation and contrastive learning. IEEE Trans. Neural Netw. Learn. Syst. 2023, 1–13.
20. Li, Z.; Yang, B.; Yin, P.; Qi, Y.; Xin, J. Feature Affinity Assisted Knowledge Distillation and Quantization of Deep Neural Networks on Label-Free Data. arXiv 2023, arXiv:2302.10899.
21. Choukroun, Y.; Kravchik, E.; Yang, F.; Kisilev, P. Low-bit quantization of neural networks for efficient inference. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), Seoul, Republic of Korea, 27–28 October 2019; pp. 3009–3018.
22. Jeong, E.; Kim, J.; Tan, S.; Lee, J.; Ha, S. Deep learning inference parallelization on heterogeneous processors with TensorRT. IEEE Embed. Syst. Lett. 2021, 14, 15–18.
23. Nahshan, Y.; Chmiel, B.; Baskin, C.; Zheltonozhskii, E.; Banner, R.; Bronstein, A.M.; Mendelson, A. Loss aware post-training quantization. Mach. Learn. 2021, 110, 3245–3262.
24. Nagel, M.; Amjad, R.A.; Van Baalen, M.; Louizos, C.; Blankevoort, T. Up or down? Adaptive rounding for post-training quantization. In Proceedings of the International Conference on Machine Learning, PMLR, Virtual, 13–18 July 2020; pp. 7197–7206.
25. Li, Y.; Gong, R.; Tan, X.; Yang, Y.; Hu, P.; Zhang, Q.; Yu, F.; Wang, W.; Gu, S. BRECQ: Pushing the Limit of Post-Training Quantization by Block Reconstruction. In Proceedings of the 9th International Conference on Learning Representations, ICLR 2021, Virtual, 3–7 May 2021.
26. Wei, X.; Gong, R.; Li, Y.; Liu, X.; Yu, F. QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quantization. In Proceedings of the Tenth International Conference on Learning Representations, ICLR 2022, Virtual, 25–29 April 2022.
27. Nagel, M.; Baalen, M.v.; Blankevoort, T.; Welling, M. Data-free quantization through weight equalization and bias correction. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 1325–1334.
28. Banner, R.; Nahshan, Y.; Soudry, D. Post training 4-bit quantization of convolutional networks for rapid-deployment. Adv. Neural Inf. Process. Syst. 2019, 7948–7956.
29. Fang, J.; Shafiee, A.; Abdel-Aziz, H.; Thorsley, D.; Georgiadis, G.; Hassoun, J.H. Post-training piecewise linear quantization for deep neural networks. In Proceedings of the Computer Vision—ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020; pp. 69–86.
30. Park, D.; Lim, S.G.; Oh, K.J.; Lee, G.; Kim, J.G. Nonlinear depth quantization using piecewise linear scaling for immersive video coding. IEEE Access 2022, 10, 4483–4494.
31. Zhao, M.; Ning, K.; Yu, S.; Liu, L.; Wu, N. Quantizing Oriented Object Detection Network via Outlier-Aware Quantization and IoU Approximation. IEEE Signal Process. Lett. 2020, 27, 1914–1918.
32. Zhao, R.; Hu, Y.; Dotzel, J.; De Sa, C.; Zhang, Z. Improving neural network quantization without retraining using outlier channel splitting. In Proceedings of the International Conference on Machine Learning, PMLR, Long Beach, CA, USA, 9–15 June 2019; pp. 7543–7552.
33. Yao, H.; Li, P.; Cao, J.; Liu, X.; Xie, C.; Wang, B. RAPQ: Rescuing Accuracy for Power-of-Two Low-bit Post-training Quantization. In Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI 2022, Vienna, Austria, 23–29 July 2022; pp. 1573–1579.
34. Jeon, Y.; Lee, C.; Cho, E.; Ro, Y. Mr.BiQ: Post-Training Non-Uniform Quantization based on Minimizing the Reconstruction Error. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022, New Orleans, LA, USA, 18–24 June 2022; pp. 12319–12328.
35. Hubara, I.; Nahshan, Y.; Hanani, Y.; Banner, R.; Soudry, D. Accurate post training quantization with small calibration sets. In Proceedings of the International Conference on Machine Learning, PMLR, Online, 18–24 July 2021; pp. 4466–4475.
36. Krishnamoorthi, R. Quantizing deep convolutional networks for efficient inference: A whitepaper. arXiv 2018, arXiv:1806.08342.
37. Baldi, P.; Sadowski, P.J. Understanding dropout. Adv. Neural Inf. Process. Syst. 2013, 2814–2822.
38. Verma, V.; Lamb, A.; Beckham, C.; Najafi, A.; Mitliagkas, I.; Lopez-Paz, D.; Bengio, Y. Manifold mixup: Better representations by interpolating hidden states. In Proceedings of the International Conference on Machine Learning, PMLR, Long Beach, CA, USA, 9–15 June 2019; pp. 6438–6447.
| Method | Bits (W/A) | Res18 | Res50 | MV2 | Reg600M | Reg3.2G | MNx2 |
|---|---|---|---|---|---|---|---|
| FP32 | 32/32 | 71.01 | 76.63 | 72.62 | 73.52 | 78.46 | 76.52 |
| LAPQ | 4/4 | 60.30 | 70.00 | 49.70 | 57.71 | 55.89 | 65.32 |
| AdaRound | 4/4 | 67.96 | 73.88 | 61.52 | 68.20 | 73.85 | 68.86 |
| BrecQ | 4/4 | 68.16 | 72.95 | 62.08 | 68.94 | 73.94 | 71.01 |
| Qdrop-4k | 4/4 | 69.05 | 74.79 | 67.72 | 70.60 | 76.21 | 72.57 |
| Qdrop | 4/4 | 69.16 | 74.91 | 67.86 | 70.95 | 76.45 | 72.81 |
| AE-Qdrop | 4/4 | 69.24 | 74.98 | 67.93 | 70.83 | 76.54 | 72.68 |
| AdaRound | 4/2 | 0.44 | 0.17 | 0.29 | 2.14 | 0.10 | 0.93 |
| BrecQ | 4/2 | 31.19 | 16.95 | 0.28 | 4.22 | 3.47 | 6.34 |
| Qdrop-4k | 4/2 | 56.46 | 61.87 | 10.26 | 46.68 | 59.58 | 16.71 |
| Qdrop | 4/2 | 58.10 | 63.26 | 17.03 | 49.78 | 61.87 | 33.96 |
| AE-Qdrop | 4/2 | 58.48 | 64.53 | 29.10 | 52.71 | 64.29 | 42.32 |
| AdaRound | 2/2 | 0.39 | 0.13 | 0.12 | 0.79 | 0.11 | 0.40 |
| BrecQ | 2/2 | 25.91 | 8.26 | 0.19 | 2.49 | 1.72 | 0.38 |
| Qdrop-4k | 2/2 | 46.12 | 48.81 | 6.18 | 31.30 | 48.38 | 16.37 |
| Qdrop | 2/2 | 51.55 | 55.21 | 9.97 | 39.31 | 53.88 | 24.21 |
| AE-Qdrop | 2/2 | 52.24 | 55.55 | 16.46 | 40.58 | 54.56 | 27.43 |
| Method | Res18 | Res50 | MV2 | Reg600M | Reg3.2G | MNx2 |
|---|---|---|---|---|---|---|
| AdaRound | 19.4 | 65.1 | 42.8 | 38.6 | 75.6 | 58.9 |
| BrecQ | 17.3 | 51.2 | 28.7 | 28.9 | 60.4 | 44.7 |
| Qdrop | 19.1 | 64.4 | 37.7 | 32.8 | 74.8 | 58.4 |
| Qdrop-4k | 4.6 | 15.1 | 8.6 | 7.4 | 16.8 | 13.8 |
| BR | 2.4 | 8.7 | 4.7 | 3.8 | 9.7 | 7.8 |
| GF | 1.9 | 5.8 | 2.9 | 2.8 | 7.1 | 5.1 |
| AE-Qdrop | 4.3 | 13.5 | 7.6 | 6.6 | 16.8 | 12.9 |
| Method | Bits (W/A) | MobileNetV1-SSD | MobileNetV2-SSD |
|---|---|---|---|
| FP32 | 32/32 | 67.60 | 68.70 |
| Qdrop-4k | 4/4 | 63.46 | 63.91 |
| Qdrop | 4/4 | 63.48 | 64.09 |
| AE-Qdrop | 4/4 | 64.21 | 65.10 |
| Qdrop-4k | 4/2 | 36.77 | 24.81 |
| Qdrop | 4/2 | 38.61 | 28.10 |
| AE-Qdrop | 4/2 | 44.67 | 37.89 |
| Qdrop-4k | 2/2 | 21.98 | 19.58 |
| Qdrop | 2/2 | 30.18 | 26.45 |
| AE-Qdrop | 2/2 | 31.69 | 28.04 |
| Method | Res18 | Res50 | MV2 | Reg600M | Reg3.2G | MNx2 |
|---|---|---|---|---|---|---|
| Baseline | 46.40 | 47.90 | 6.44 | 27.73 | 41.17 | 15.72 |
| Baseline+RDQA | 50.00 | 52.29 | 7.52 | 36.29 | 52.89 | 16.64 |
| Baseline+RWQA | 51.05 | 52.89 | 8.78 | 36.92 | 53.51 | 20.35 |
| Baseline+POS | 47.12 | 49.55 | 11.10 | 28.75 | 41.74 | 20.49 |
| Baseline+RWQA+POS | 51.73 | 55.36 | 13.33 | 39.06 | 54.32 | 24.61 |
| Baseline+RWQA+POS+GF | 52.24 | 55.55 | 16.46 | 40.58 | 54.56 | 27.43 |
| Bits (W/A) | Method | Res18 | Res50 | MV2 | Reg600M | Reg3.2G | MNx2 |
|---|---|---|---|---|---|---|---|
| W4A4 | MSE | 49.75 | 65.54 | 22.40 | 51.70 | 66.75 | 49.71 |
| W4A4 | MSE+GF | 65.05 | 69.10 | 36.69 | 60.05 | 70.24 | 56.68 |
| W4A2 | MSE | 9.33 | 4.35 | 0.11 | 1.9 | 2.01 | 0.27 |
| W4A2 | MSE+GF | 25.18 | 6.98 | 0.18 | 3.3 | 4.43 | 0.28 |
| W2A2 | MSE | 0.08 | 0.16 | 0.11 | 0.15 | 0.11 | 0.10 |
| W2A2 | MSE+GF | 0.08 | 0.10 | 0.09 | 0.16 | 0.17 | 0.10 |