Video Super-Resolution Network with Gated High-Low Resolution Frames
Abstract
1. Introduction
- (1) A gated motion compensation approach for high- and low-resolution frames, which adaptively selects useful information from high- and low-resolution neighboring frames to mitigate the effect of motion estimation errors;
- (2) A pre-initial hidden state network, which reduces the frame-imbalance effect introduced by adopting a unidirectional recurrent framework;
- (3) A local-scale hierarchical salient feature fusion network, which attends to features at different scales and in different regions to obtain locally salient feature information;
- (4) A plug-and-play hierarchical hybrid attention module, which filters useful information from different feature levels and recovers better localized high-frequency detail.
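Contribution (1) rests on a soft gate that blends the two aligned neighbor-feature branches so that regions with unreliable motion estimates can fall back on the other source. As a loose illustrative sketch of this kind of gating (a NumPy stand-in, not the paper's implementation; the function name `gated_fusion` and the projection parameters `w`, `b` are assumptions):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(feat_hr, feat_lr, w, b):
    """Blend aligned high- and low-resolution neighbor features.

    A pixel-wise gate (sigmoid of a linear projection of the
    concatenated features) decides how much to trust each branch.
    feat_hr, feat_lr: (H, W, C) aligned feature maps
    w: (2C, C) projection weights, b: (C,) bias -- placeholders here.
    """
    stacked = np.concatenate([feat_hr, feat_lr], axis=-1)  # (H, W, 2C)
    gate = sigmoid(stacked @ w + b)                        # (H, W, C), in (0, 1)
    # Per-element convex combination of the two branches.
    return gate * feat_hr + (1.0 - gate) * feat_lr
```

Because the output is a per-element convex combination, pixels where the high-resolution branch was misaligned can lean on the low-resolution branch instead, which is the intuition behind avoiding motion-estimation artifacts.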
2. Related Works
3. The Gated High-Low Resolution Frames Network
3.1. Overall Structure
3.2. Pre-Initial Hidden State Network
3.3. Gate-Guided Adjacent High-Low Resolution Frame Alignment Network
- 1. Feature extraction module
- 2. Optical flow estimation module (SpyNet)
- 3. Gated adjacent high-low resolution frame motion compensation module
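The motion compensation step warps each neighboring frame toward the reference frame using the flow estimated by SpyNet. A minimal backward-warping sketch with bilinear sampling (an illustrative stand-in for the actual module; the name `warp_backward` and the border-clamping behavior are assumptions):

```python
import numpy as np

def warp_backward(img, flow):
    """Warp a neighboring frame toward the reference frame.

    For each reference pixel (x, y), sample the neighbor at
    (x + u, y + v) with bilinear interpolation, where (u, v) is
    the estimated optical flow. Out-of-bounds samples are clamped.
    img:  (H, W) grayscale frame
    flow: (H, W, 2) per-pixel displacements (u, v)
    """
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    x_src = np.clip(xs + flow[..., 0], 0, w - 1)
    y_src = np.clip(ys + flow[..., 1], 0, h - 1)
    # Integer corners and fractional weights for bilinear sampling.
    x0 = np.floor(x_src).astype(int); x1 = np.minimum(x0 + 1, w - 1)
    y0 = np.floor(y_src).astype(int); y1 = np.minimum(y0 + 1, h - 1)
    wx = x_src - x0; wy = y_src - y0
    top = img[y0, x0] * (1 - wx) + img[y0, x1] * wx
    bot = img[y1, x0] * (1 - wx) + img[y1, x1] * wx
    return top * (1 - wy) + bot * wy
```

Warping errors from inaccurate flow are exactly what the gated module downstream is meant to compensate for: the gate can discount warped features where the sampled content does not match the reference.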
3.4. Scale Local Level Significant Feature Fusion Network
3.5. Reconstruction Network
4. Experimental Results
4.1. Implementation Details
4.2. Ablation Study
- Ablation Experiment I
- Ablation Experiment II
- Ablation Experiment III
4.3. Comparisons with State-of-the-Arts
4.3.1. Comparison of Experimental Results on the Vid4 Test Set
4.3.2. Comparison of Experimental Results on the SPMCS-11 Test Set
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Caballero, J.; Ledig, C.; Aitken, A.; Acosta, A.; Totz, J.; Wang, Z.; Shi, W. Real-time video super-resolution with spatio-temporal networks and motion compensation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017.
- Chan, K.C.; Wang, X.; Yu, K.; Dong, C.; Loy, C.C. BasicVSR: The search for essential components in video super-resolution and beyond. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021.
- Sajjadi, M.S.; Vemulapalli, R.; Brown, M. Frame-recurrent video super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018.
- Xue, T.; Chen, B.; Wu, J.; Wei, D.; Freeman, W.T. Video enhancement with task-oriented flow. Int. J. Comput. Vis. 2019, 127, 1106–1125.
- Wang, L.; Guo, Y.; Lin, Z.; Deng, X.; An, W. Learning for video super-resolution through HR optical flow estimation. In Proceedings of Computer Vision—ACCV 2018: 14th Asian Conference on Computer Vision, Perth, Australia, 2–6 December 2018; Revised Selected Papers, Part I; Springer: Berlin/Heidelberg, Germany, 2019.
- Wang, L.; Guo, Y.; Liu, L.; Lin, Z.; Deng, X.; An, W. Deep video super-resolution using HR optical flow estimation. IEEE Trans. Image Process. 2020, 29, 4323–4336.
- Li, H.; Xu, J.; Hou, S. Optical flow enhancement and effect research in action recognition. In Proceedings of the 2021 IEEE 13th International Conference on Computer Research and Development (ICCRD), Beijing, China, 5–7 January 2021.
- Tao, X.; Gao, H.; Liao, R.; Wang, J.; Jia, J. Detail-revealing deep video super-resolution. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017.
- Haris, M.; Shakhnarovich, G.; Ukita, N. Recurrent back-projection network for video super-resolution. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019.
- Chan, K.C.; Zhou, S.; Xu, X.; Loy, C.C. BasicVSR++: Improving video super-resolution with enhanced propagation and alignment. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022.
- Liang, J.; Fan, Y.; Xiang, X.; Ranjan, R.; Ilg, E.; Green, S.; Cao, J.; Zhang, K.; Van Gool, L. Recurrent video restoration transformer with guided deformable attention. Adv. Neural Inf. Process. Syst. 2022, 35, 378–393.
- Wang, P.; Sertel, E. Multi-frame super-resolution of remote sensing images using attention-based GAN models. Knowl.-Based Syst. 2023, 266, 110387.
- Chiche, B.N.; Woiselle, A.; Frontera-Pons, J.; Starck, J.-L. Stable long-term recurrent video super-resolution. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022.
- Fuoli, D.; Gu, S.; Timofte, R. Efficient video super-resolution through recurrent latent space propagation. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), Seoul, Republic of Korea, 27–29 October 2019.
- Dong, C.; Loy, C.C.; He, K.; Tang, X. Image super-resolution using deep convolutional networks. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 38, 295–307.
- Ghasemi-Falavarjani, N.; Moallem, P.; Rahimi, A. Particle filter based multi-frame image super resolution. Signal Image Video Process. 2022, 6, 1–8.
- Wang, X.; Chan, K.C.; Yu, K.; Dong, C.; Loy, C.C. EDVR: Video restoration with enhanced deformable convolutional networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA, 15–20 June 2019.
- Ranjan, A.; Black, M.J. Optical flow estimation using a spatial pyramid network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017.
- Li, F.; Bai, H.; Zhao, Y. Learning a deep dual attention network for video super-resolution. IEEE Trans. Image Process. 2020, 29, 4474–4488.
- Wang, Z.; Yi, P.; Jiang, K.; Jiang, J.; Han, Z.; Lu, T.; Ma, J.; Yi, H. Multi-memory convolutional neural network for video super-resolution. IEEE Trans. Image Process. 2018, 28, 2530–2544.
- Isobe, T.; Zhu, F.; Jia, X.; Wang, S. Revisiting temporal modeling for video super-resolution. arXiv 2020, arXiv:2008.05765.
- Jo, Y.; Oh, S.W.; Kang, J.; Kim, S.J. Deep video super-resolution network using dynamic upsampling filters without explicit motion compensation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018.
- Khattab, M.M.; Zeki, A.M.; Alwan, A.A.; Badawy, A.S. Regularization-based multi-frame super-resolution: A systematic review. J. King Saud Univ. Comput. Inf. Sci. 2020, 32, 755–762.
- Yi, P.; Wang, Z.; Jiang, K.; Jiang, J.; Ma, J. Progressive fusion video super-resolution network via exploiting non-local spatio-temporal correlations. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27–29 October 2019.
- Yi, P.; Wang, Z.; Jiang, K.; Jiang, J.; Lu, T.; Tian, X.; Ma, J. Omniscient video super-resolution. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021.
- Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018.
- Woo, S.; Park, J.; Lee, J.-Y.; Kweon, I.S. CBAM: Convolutional block attention module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 10–14 September 2018.
- Liu, C.; Sun, D. On Bayesian adaptive video super resolution. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 36, 346–360.
- Isobe, T.; Li, S.; Jia, X.; Yuan, S.; Slabaugh, G.; Xu, C.; Ma, Y. Video super-resolution with temporal group attention. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020.
| IHSNet | GAFANet | SLLFNet | PSNR/SSIM |
|---|---|---|---|
| √ | √ |  | 27.42/0.8381 |
| √ |  | √ | 27.25/0.8336 |
|  | √ | √ | 26.25/0.7886 |
| √ | √ | √ | 27.47/0.8372 |
| Method | PSNR/SSIM |
|---|---|
| GHLRF-o | 27.28/0.8346 |
| GHLRF-w | 27.47/0.8372 |
| Method | PSNR/SSIM |
|---|---|
| HRA | 27.40/0.8375 |
| LRA | 27.36/0.8368 |
| LHRA | 27.47/0.8372 |
| Method | Params (M) | Runtime (ms) | Calendar | City | Foliage | Walk | Average |
|---|---|---|---|---|---|---|---|
| Bicubic | - | - | 18.80/0.4794 | 23.78/0.5129 | 21.39/0.4227 | 22.88/0.6976 | 21.71/0.5280 |
| FRVSR | 5.1 | 137 | 23.46/0.7854 | 27.70/0.8099 | 25.96/0.7560 | 29.69/0.8990 | 26.70/0.8126 |
| DUF | 5.8 | 974 | 24.04/0.8110 | 28.27/0.8313 | 26.41/0.7709 | 30.30/0.9141 | 27.33/0.8318 |
| PFNL | 3.0 | 295 | - | - | - | - | 27.16/0.8355 |
| RBPN | 12.2 | 1507 | 23.95/0.8070 | 27.70/0.8036 | 26.22/0.7575 | 30.69/0.9104 | 27.14/0.8196 |
| RLSP | 4.2 | 49 | 24.60/0.8355 | 28.14/0.8453 | 26.75/0.7983 | 30.88/0.9192 | 27.60/0.8476 |
| TGA | 5.8 | 441 | 24.50/0.8290 | 28.50/0.8420 | 26.59/0.7793 | 30.95/0.9171 | 27.63/0.8419 |
| RRN | 3.4 | 45 | 24.57/0.8342 | 28.51/0.8467 | 26.94/0.7979 | 30.74/0.9164 | 27.69/0.8488 |
| BasicVSR | 6.3 | 63 | - | - | - | - | 27.96/0.8553 |
| GHLRF | 20.4 | 110 | 25.02/0.8482 | 30.09/0.8822 | 26.95/0.7961 | 31.34/0.9253 | 28.60/0.8630 |
| Clip Name | Bicubic | RBPN | RRN | TGA | GHLRF |
|---|---|---|---|---|---|
| car05_001 | 24.94/0.6799 | 31.92/0.9016 | 31.79/0.9048 | 31.84/0.8987 | 32.70/0.9192 |
| hdclub_003_001 | 17.54/0.4186 | 21.89/0.7246 | 22.42/0.7633 | 22.31/0.7537 | 22.74/0.7764 |
| hitachi_isee5_001 | 17.44/0.4283 | 26.25/0.9042 | 26.45/0.9058 | 26.47/0.9059 | 27.88/0.9327 |
| hk004_001 | 25.93/0.7221 | 33.34/0.9010 | 33.48/0.9138 | 33.76/0.9136 | 34.23/0.9213 |
| HKVTG_004 | 25.29/0.4765 | 29.50/0.7975 | 29.70/0.8090 | 29.68/0.8088 | 29.94/0.8187 |
| jvc_009_001 | 23.15/0.6713 | 29.99/0.9093 | 29.52/0.9038 | 30.19/0.9140 | 31.27/0.9326 |
| NYVTG_006 | 25.69/0.7291 | 33.18/0.9227 | 33.02/0.9222 | 33.75/0.9334 | 34.00/0.9377 |
| PYVTG_012 | 23.74/0.6249 | 27.56/0.8231 | 27.98/0.8403 | 27.80/0.8350 | 27.94/0.8434 |
| RMVTG_011 | 20.98/0.5137 | 27.59/0.8157 | 28.20/0.8351 | 28.35/0.8409 | 28.45/0.8461 |
| veni3_011 | 25.73/0.8312 | 36.58/0.9735 | 34.17/0.9695 | 36.40/0.9740 | 37.15/0.9774 |
| veni5_015 | 24.59/0.7823 | 32.92/0.9443 | 31.50/0.9411 | 33.24/0.9486 | 33.54/0.9517 |
| Average | 23.18/0.6253 | 30.06/0.8743 | 29.84/0.8827 | 30.34/0.8842 | 30.90/0.8961 |
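The PSNR figures reported in these tables follow the standard definition in dB for 8-bit images. For reference, a generic sketch (not the authors' evaluation code, which may crop borders or evaluate on the Y channel only):

```python
import numpy as np

def psnr(ref, deg, peak=255.0):
    """Peak signal-to-noise ratio (dB) between a reference image
    and a degraded/reconstructed image of the same shape."""
    mse = np.mean((ref.astype(np.float64) - deg.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

Higher is better: an average gain of roughly 0.6 dB over the next-best method, as in the last row above, corresponds to a noticeably lower mean squared error.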
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Ouyang, N.; Ou, Z.; Lin, L. Video Super-Resolution Network with Gated High-Low Resolution Frames. Appl. Sci. 2023, 13, 8299. https://doi.org/10.3390/app13148299