Preprocessing for Multi-Dimensional Enhancement and Reconstruction in Neural Video Compression
Abstract
1. Introduction
- (1) We propose a Multi-Dimensional Enhancement and Reconstruction (MDER) preprocessing method for video coding that removes encoding noise and enhances details, reconstructing frames that are better suited to video coding.
- (2) We integrate Dense Blocks to make fuller use of redundant information, which improves the efficiency of residual information processing.
- (3) The proposed method improves coding efficiency while maintaining consistent visual quality, and it is easier to deploy on devices with limited processing power.
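The dense connectivity behind contribution (2) can be illustrated with a small sketch. It assumes the standard DenseNet pattern, in which each layer receives the concatenation of the block input and all earlier layer outputs; the layer layout actually used in the paper is not given here. The names `dense_block`, `make_toy_layer`, and `input_channels` are hypothetical, and the toy layer (a channel mean) merely stands in for a learned convolution.

```python
from typing import Callable, List

# Toy "feature map": one float per channel (an assumption made for illustration;
# a real network would use tensors of shape [channels, height, width]).
Feature = List[float]
Layer = Callable[[Feature], Feature]


def make_toy_layer(growth_rate: int) -> Layer:
    """Return a stand-in layer that emits `growth_rate` new channels.

    Each new channel is the mean of all input channels. A real Dense Block
    would use a learned convolution here instead.
    """
    def layer(inp: Feature) -> Feature:
        mean = sum(inp) / len(inp)
        return [mean] * growth_rate
    return layer


def dense_block(x: Feature, layers: List[Layer]) -> Feature:
    """Apply layers with dense connectivity.

    Every layer sees the block input plus the outputs of all earlier layers,
    so previously computed features are reused rather than recomputed.
    """
    features = list(x)
    for layer in layers:
        new_channels = layer(features)
        features = features + new_channels  # channel-wise concatenation
    return features


def input_channels(c0: int, growth_rate: int, num_layers: int) -> List[int]:
    """Number of input channels seen by each layer of a dense block."""
    return [c0 + i * growth_rate for i in range(num_layers)]


if __name__ == "__main__":
    out = dense_block([1.0, 3.0], [make_toy_layer(2) for _ in range(3)])
    print(len(out))                   # 2 input channels + 3 layers * 2 new channels
    print(input_channels(64, 32, 4))  # channel count grows linearly with depth
```

Because every layer's input grows by only `growth_rate` channels, each layer can stay narrow while still reaching all earlier features, which is the sense in which dense connections reuse redundant information.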
2. Materials and Methods
2.1. Multi-Dimensional Enhancement and Reconstruction (MDER)
2.2. Dense Feature-Enhanced Video Compression (DFVC)
2.3. Preprocessing Module for Neural Video Encoding
3. Experiments and Results
3.1. Datasets
3.2. Implementation Details
3.3. Evaluation Methods
4. Results and Discussion
4.1. Results
4.2. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Attribute | MDER | DPP | RPP |
---|---|---|---|
Model Complexity | Low to Moderate | High | Moderate |
Total Processing Time | 45.41 s | 59.01 s | 56.25 s |
Average Processing Time per Frame | 0.0086 s | 0.0092 s | 0.0079 s |
Average FPS | 49 | 45 | 48 |
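To put the total processing times in the table on a common scale, the short sketch below computes how much less time MDER takes relative to each of the other two methods. It uses only the "Total Processing Time" row; the helper name `relative_saving` is hypothetical, and the percentages are plain arithmetic on the tabulated values rather than figures reported by the authors.

```python
# Total processing times in seconds, copied from the table above.
total_time_s = {"MDER": 45.41, "DPP": 59.01, "RPP": 56.25}


def relative_saving(baseline: float, candidate: float) -> float:
    """Percentage by which `candidate` is faster than `baseline`."""
    return (baseline - candidate) / baseline * 100.0


if __name__ == "__main__":
    for name in ("DPP", "RPP"):
        saving = relative_saving(total_time_s[name], total_time_s["MDER"])
        print(f"MDER vs {name}: {saving:.2f}% less total processing time")
```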
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wang, J.; Zhang, Q.; Zhao, H.; Wang, G.; Shang, X. Preprocessing for Multi-Dimensional Enhancement and Reconstruction in Neural Video Compression. Appl. Sci. 2024, 14, 8626. https://doi.org/10.3390/app14198626