Image Compression Network Structure Based on Multiscale Region of Interest Attention Network
Abstract
1. Introduction
- A multiscale ROI spatial attention module was designed for the image compression network.
- A multiscale ROI autoencoding network with hierarchical hyper-prior layers was designed to analyze the image comprehensively and reduce spatial redundancy more effectively. This improves the rate-distortion performance of image compression, and applying a spatial attention mechanism to the ROI within the compression network yields superior compression performance in the ROI.
2. Related Work
Image Compression Network Structure Based on Hyper-Prior Architecture
3. Proposed Method
3.1. Motivation
3.2. The Attention Module Based on the Region of Interest
3.3. Framework
4. Experimental Results
4.1. Performance
4.2. The Dataset
4.3. Experimental Environment and Parameter Settings
4.4. Results
5. Discussion
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
ROI | Region of Interest |
CFHPM | Coarse-to-Fine Hyper-Prior Modeling for Learned Image Compression |
MROI-CFHPM | Multiscale Region of Interest Coarse-to-Fine Hyper-Prior Modeling for Learned Image Compression |
Encoder-Side Network | Operation | Output Tensor Size | Activation Function |
---|---|---|---|
Input | / | (b,h,w,c) | / |
1 | conv(3 × 3) | (b,h,w,2c) | Linear |
Downsampling | Space-to-Depth | (b,h/2,w/2,8c) | / |
2 | conv(1 × 1) | (b,h/2,w/2,4c) | ReLU |
3 | conv(1 × 1) | (b,h/2,w/2,4c) | ReLU |
4 | conv(1 × 1) | (b,h/2,w/2,c′) | Linear |
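
The Space-to-Depth row in the table above halves the spatial resolution and multiplies the channel count by four, taking (b,h,w,2c) to (b,h/2,w/2,8c). The following is a minimal NumPy sketch of such a rearrangement, assuming an NHWC tensor layout and a block size of 2 (both inferred from the tensor sizes in the table, not stated in it). The function name `space_to_depth` and the channel ordering inside each block are illustrative choices, not necessarily those of the authors' implementation.

```python
import numpy as np

def space_to_depth(x, block=2):
    """Move each (block x block) spatial patch into the channel axis.

    Assumes NHWC layout: x has shape (b, h, w, c).
    Returns an array of shape (b, h/block, w/block, block*block*c).
    """
    b, h, w, c = x.shape
    assert h % block == 0 and w % block == 0, "h and w must be divisible by block"
    # Split each spatial axis into (coarse index, offset inside the block).
    x = x.reshape(b, h // block, block, w // block, block, c)
    # Bring the two in-block offsets next to the channel axis.
    x = x.transpose(0, 1, 3, 2, 4, 5)
    # Merge (row offset, column offset, channel) into a single channel axis.
    return x.reshape(b, h // block, w // block, block * block * c)

# Toy input with b=1, h=w=4 and 2 channels (playing the role of 2c with c=1).
x = np.arange(1 * 4 * 4 * 2, dtype=np.float32).reshape(1, 4, 4, 2)
y = space_to_depth(x)
print(x.shape, "->", y.shape)
```

No values are discarded by this step: it only reorders them, so the number of elements is unchanged and the subsequent 1 × 1 convolutions see every pixel of each 2 × 2 neighborhood as channels.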

Decoder-Side Network | Operation | Output Tensor Size | Activation Function |
---|---|---|---|
Input | / | (b,h/2,w/2,c′) | / |
1 | conv(1 × 1) | (b,h/2,w/2,4c) | Linear |
Upsampling | Depth-to-Space | (b,h,w,c) | / |
2 | conv(1 × 1) | (b,h,w,4c) | ReLU |
3 | conv(1 × 1) | (b,h,w,4c) | ReLU |
4 | conv(3 × 3) | (b,h,w,c) | Linear |
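
The resampling step in the table above takes (b,h/2,w/2,4c) to (b,h,w,c), which doubles the spatial resolution and divides the channel count by four. This is the rearrangement commonly called depth-to-space. Below is a minimal NumPy sketch, assuming an NHWC layout and a block size of 2 (inferred from the tensor sizes). The function name `depth_to_space` and the in-block channel ordering are illustrative assumptions rather than the authors' code.

```python
import numpy as np

def depth_to_space(x, block=2):
    """Spread groups of block*block channels back over a spatial patch.

    Assumes NHWC layout: x has shape (b, h, w, c) with c divisible by block**2.
    Returns an array of shape (b, h*block, w*block, c / block**2).
    """
    b, h, w, c = x.shape
    assert c % (block * block) == 0, "channels must be divisible by block**2"
    c_out = c // (block * block)
    # Split the channel axis into (row offset, column offset, output channel).
    x = x.reshape(b, h, w, block, block, c_out)
    # Interleave the offsets with the coarse spatial indices.
    x = x.transpose(0, 1, 3, 2, 4, 5)
    # Merge (coarse index, offset) pairs into full-resolution spatial axes.
    return x.reshape(b, h * block, w * block, c_out)

# Toy input with b=1, h/2=w/2=2 and 4 channels (playing the role of 4c with c=1).
z = np.arange(1 * 2 * 2 * 4, dtype=np.float32).reshape(1, 2, 2, 4)
out = depth_to_space(z)
print(z.shape, "->", out.shape)
```

In this sketch the four channels of each low-resolution position fill one 2 × 2 patch of the output in row-major order, so the operation is a pure reordering with no learned parameters.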
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhang, J.; Zhang, S.; Wang, H.; Li, Y.; Lu, R. Image Compression Network Structure Based on Multiscale Region of Interest Attention Network. Remote Sens. 2023, 15, 522. https://doi.org/10.3390/rs15020522