Real-Time Video Super-Resolution with Spatio-Temporal Modeling and Redundancy-Aware Inference
Abstract
1. Introduction
2. Related Works
2.1. Image Super-Resolution
2.2. Video Super-Resolution
3. Method
3.1. Overall Architecture
3.2. Fast Temporal Information Aggregation Module
3.3. Redundancy-Aware Inference
Algorithm 1: Redundancy-Aware Inference Algorithm for the Proposed Model.
4. Experiments
4.1. Dataset
4.2. Implementation Details
4.3. Comparisons
4.4. Efficiency
4.5. Ablation Analysis
4.6. Limitation
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Kappeler, A.; Yoo, S.; Dai, Q.; Katsaggelos, A.K. Video Super-Resolution With Convolutional Neural Networks. IEEE Trans. Comput. Imaging 2016, 2, 109–122.
- Rota, C.; Buzzelli, M.; Bianco, S.; Schettini, R. Video restoration based on deep learning: A comprehensive survey. Artif. Intell. Rev. 2023, 56, 5317–5364.
- Farooq, M.; Dailey, M.N.; Mahmood, A.; Moonrinta, J.; Ekpanyapong, M. Human face super-resolution on poor quality surveillance video footage. Neural Comput. Appl. 2021, 33, 13505–13523.
- Xiao, Y.; Su, X.; Yuan, Q.; Liu, D.; Shen, H.; Zhang, L. Satellite Video Super-Resolution via Multiscale Deformable Convolution Alignment and Temporal Grouping Projection. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–19.
- Anwar, S.; Khan, S.H.; Barnes, N. A Deep Journey into Super-resolution: A Survey. ACM Comput. Surv. 2020, 53, 60.
- Jo, Y.; Oh, S.W.; Kang, J.; Kim, S.J. Deep Video Super-Resolution Network Using Dynamic Upsampling Filters without Explicit Motion Compensation. In Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2018, Salt Lake City, UT, USA, 18–22 June 2018; Computer Vision Foundation/IEEE Computer Society: Washington, DC, USA, 2018; pp. 3224–3232.
- Xue, T.; Chen, B.; Wu, J.; Wei, D.; Freeman, W.T. Video Enhancement with Task-Oriented Flow. Int. J. Comput. Vis. 2019, 127, 1106–1125.
- Wang, X.; Chan, K.C.K.; Yu, K.; Dong, C.; Loy, C.C. EDVR: Video Restoration With Enhanced Deformable Convolutional Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, CVPR Workshops 2019, Long Beach, CA, USA, 16–20 June 2019; Computer Vision Foundation/IEEE: Washington, DC, USA, 2019; pp. 1954–1963.
- Choi, Y.J.; Lee, Y.; Kim, B. Wavelet Attention Embedding Networks for Video Super-Resolution. In Proceedings of the 25th International Conference on Pattern Recognition, ICPR 2020, Milan, Italy, 10–15 January 2021; IEEE: Washington, DC, USA, 2020; pp. 7314–7320.
- Liang, J.; Fan, Y.; Xiang, X.; Ranjan, R.; Ilg, E.; Green, S.; Cao, J.; Zhang, K.; Timofte, R.; Gool, L.V. Recurrent Video Restoration Transformer with Guided Deformable Attention. In Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, New Orleans, LA, USA, 28 November–9 December 2022; pp. 378–393.
- Caballero, J.; Ledig, C.; Aitken, A.P.; Acosta, A.; Totz, J.; Wang, Z.; Shi, W. Real-Time Video Super-Resolution with Spatio-Temporal Networks and Motion Compensation. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, Honolulu, HI, USA, 21–26 July 2017; IEEE Computer Society: Washington, DC, USA, 2017; pp. 2848–2857.
- Chan, K.C.K.; Wang, X.; Yu, K.; Dong, C.; Loy, C.C. BasicVSR: The Search for Essential Components in Video Super-Resolution and Beyond. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2021, Virtual, 19–25 June 2021; Computer Vision Foundation/IEEE: Washington, DC, USA, 2021; pp. 4947–4956.
- Bao, W.; Lai, W.; Zhang, X.; Gao, Z.; Yang, M. MEMC-Net: Motion Estimation and Motion Compensation Driven Neural Network for Video Interpolation and Enhancement. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 43, 933–948.
- Tao, X.; Gao, H.; Liao, R.; Wang, J.; Jia, J. Detail-Revealing Deep Video Super-Resolution. In Proceedings of the IEEE International Conference on Computer Vision, ICCV 2017, Venice, Italy, 22–29 October 2017; IEEE Computer Society: Washington, DC, USA, 2017; pp. 4482–4490.
- Yi, P.; Wang, Z.; Jiang, K.; Jiang, J.; Ma, J. Progressive Fusion Video Super-Resolution Network via Exploiting Non-Local Spatio-Temporal Correlations. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, ICCV 2019, Seoul, Republic of Korea, 27 October–2 November 2019; IEEE: Washington, DC, USA, 2019; pp. 3106–3115.
- Li, W.; Tao, X.; Guo, T.; Qi, L.; Lu, J.; Jia, J. MuCAN: Multi-correspondence Aggregation Network for Video Super-Resolution. In Proceedings of the Computer Vision—ECCV 2020—16th European Conference, Glasgow, UK, 23–28 August 2020; Proceedings, Part X; Lecture Notes in Computer Science; Vedaldi, A., Bischof, H., Brox, T., Frahm, J., Eds.; Springer: Cham, Switzerland, 2020; Volume 12355, pp. 335–351.
- Li, S.; He, F.; Du, B.; Zhang, L.; Xu, Y.; Tao, D. Fast Spatio-Temporal Residual Network for Video Super-Resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA, 16–20 June 2019; Computer Vision Foundation/IEEE: Washington, DC, USA, 2019; pp. 10522–10531.
- Xia, B.; He, J.; Zhang, Y.; Wang, Y.; Tian, Y.; Yang, W.; Van Gool, L. Structured Sparsity Learning for Efficient Video Super-Resolution. In Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 18–22 June 2023; pp. 22638–22647.
- Lian, W.; Lian, W. Sliding Window Recurrent Network for Efficient Video Super-Resolution. In Proceedings of the Computer Vision—ECCV 2022 Workshops—Tel Aviv, Israel, 23–27 October 2022; Proceedings, Part II; Lecture Notes in Computer Science; Karlinsky, L., Michaeli, T., Nishino, K., Eds.; Springer: Cham, Switzerland, 2022; Volume 13802, pp. 591–601.
- Cao, Y.; Wang, C.; Song, C.; Tang, Y.; Li, H. Real-Time Super-Resolution System of 4K-Video Based on Deep Learning. In Proceedings of the 32nd IEEE International Conference on Application-Specific Systems, Architectures and Processors, ASAP 2021, Virtual Conference, 7–9 July 2021; IEEE: Washington, DC, USA, 2021; pp. 69–76.
- Dai, J.; Qi, H.; Xiong, Y.; Li, Y.; Zhang, G.; Hu, H.; Wei, Y. Deformable Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision, ICCV 2017, Venice, Italy, 22–29 October 2017; IEEE Computer Society: Washington, DC, USA, 2017; pp. 764–773.
- Zhang, Y.; Li, K.; Li, K.; Wang, L.; Zhong, B.; Fu, Y. Image Super-Resolution Using Very Deep Residual Channel Attention Networks. In Proceedings of the Computer Vision—ECCV 2018—15th European Conference, Munich, Germany, 8–14 September 2018; Proceedings, Part VII; Lecture Notes in Computer Science; Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y., Eds.; Springer: Cham, Switzerland, 2018; Volume 11211, pp. 294–310.
- Dong, C.; Loy, C.C.; He, K.; Tang, X. Image Super-Resolution Using Deep Convolutional Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 38, 295–307.
- Lim, B.; Son, S.; Kim, H.; Nah, S.; Lee, K.M. Enhanced Deep Residual Networks for Single Image Super-Resolution. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops, CVPR Workshops 2017, Honolulu, HI, USA, 21–26 July 2017; IEEE Computer Society: Washington, DC, USA, 2017; pp. 1132–1140.
- Hui, Z.; Gao, X.; Yang, Y.; Wang, X. Lightweight Image Super-Resolution with Information Multi-distillation Network. In Proceedings of the 27th ACM International Conference on Multimedia, MM 2019, Nice, France, 21–25 October 2019; Amsaleg, L., Huet, B., Larson, M.A., Gravier, G., Hung, H., Ngo, C., Ooi, W.T., Eds.; ACM: New York, NY, USA, 2019; pp. 2024–2032.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention is All you Need. In Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, Long Beach, CA, USA, 4–9 December 2017; pp. 5998–6008.
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. In Proceedings of the 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, 3–7 May 2021.
- Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, ICCV 2021, Montreal, QC, Canada, 10–17 October 2021; IEEE: Washington, DC, USA, 2021; pp. 9992–10002.
- Liang, J.; Cao, J.; Sun, G.; Zhang, K.; Gool, L.V.; Timofte, R. SwinIR: Image Restoration Using Swin Transformer. In Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, ICCVW 2021, Montreal, QC, Canada, 11–17 October 2021; IEEE: Washington, DC, USA, 2021; pp. 1833–1844.
- Wang, L.; Guo, Y.; Liu, L.; Lin, Z.; Deng, X.; An, W. Deep Video Super-Resolution Using HR Optical Flow Estimation. IEEE Trans. Image Process. 2020, 29, 4323–4336.
- Kim, S.Y.; Lim, J.; Na, T.; Kim, M. Video Super-Resolution Based on 3D-CNNS with Consideration of Scene Change. In Proceedings of the 2019 IEEE International Conference on Image Processing, ICIP 2019, Taipei, Taiwan, 22–25 September 2019; IEEE: Washington, DC, USA, 2019; pp. 2831–2835.
- Isobe, T.; Li, S.; Jia, X.; Yuan, S.; Slabaugh, G.G.; Xu, C.; Li, Y.; Wang, S.; Tian, Q. Video Super-Resolution With Temporal Group Attention. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020, Seattle, WA, USA, 13–19 June 2020; Computer Vision Foundation/IEEE: Washington, DC, USA, 2020; pp. 8005–8014.
- Tian, Y.; Zhang, Y.; Fu, Y.; Xu, C. TDAN: Temporally-Deformable Alignment Network for Video Super-Resolution. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020, Seattle, WA, USA, 13–19 June 2020; Computer Vision Foundation/IEEE: Washington, DC, USA, 2020; pp. 3357–3366.
- Ying, X.; Wang, L.; Wang, Y.; Sheng, W.; An, W.; Guo, Y. Deformable 3D Convolution for Video Super-Resolution. IEEE Signal Process. Lett. 2020, 27, 1500–1504.
- Xiao, Y.; Yuan, Q.; Jiang, K.; Jin, X.; He, J.; Zhang, L.; Lin, C. Local-Global Temporal Difference Learning for Satellite Video Super-Resolution. arXiv 2023, arXiv:2304.04421.
- Wang, H.; Xiang, X.; Tian, Y.; Yang, W.; Liao, Q. STDAN: Deformable Attention Network for Space-Time Video Super-Resolution. IEEE Trans. Neural Netw. Learn. Syst. 2023, 1–11.
- Xiao, Y.; Yuan, Q.; Zhang, Q.; Zhang, L. Deep Blind Super-Resolution for Satellite Video. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5516316.
- Shi, S.; Gu, J.; Xie, L.; Wang, X.; Yang, Y.; Dong, C. Rethinking Alignment in Video Super-Resolution Transformers. In Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, New Orleans, LA, USA, 28 November–9 December 2022; Koyejo, S., Mohamed, S., Agarwal, A., Belgrave, D., Cho, K., Oh, A., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2022; pp. 36081–36093.
- Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. In Proceedings of the 3rd International Conference on Learning Representations—ICLR 2015, San Diego, CA, USA, 7–9 May 2015.
- Agustsson, E.; Timofte, R. NTIRE 2017 Challenge on Single Image Super-Resolution: Dataset and Study. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, Honolulu, HI, USA, 21–26 July 2017.
- Wang, Z.; Bovik, A.; Sheikh, H.; Simoncelli, E. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–612.
- Zhu, X.; Li, Z.; Zhang, X.Y.; Li, C.; Liu, Y.; Xue, Z. Residual Invertible Spatio-Temporal Network for Video Super-Resolution. Proc. AAAI Conf. Artif. Intell. 2019, 33, 5981–5988.

| Layer No. | Input Layer No. | Layer | Input Channels | Output Channels |
|---|---|---|---|---|
| 0 | | Input | | |
| 1 | 0 | Concatenation | | |
| 2 | 1 | Channel Attention | | |
| 3 | 1 and 2 | Elementwise Add | | |
| 4 | 3 | Convolution | | |
| 5 | 4 | RCAB | | |
| 6 | 5 | LeakyReLU | | |
| 7 | 6 | Channel Attention | | |
| 8 | 6 and 7 | Elementwise Add | | |
| 9 | 8 | Convolution | | |
| 10 | 9 | RCAB | | |
| 11 | 10 | LeakyReLU | | |
| 12 | 11 | Output | | |
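
The wiring in the layer table above can be read as a small feed-forward graph. The following is a minimal Python sketch that only transcribes the "Layer No." and "Input Layer No." columns and checks two structural properties; the names `LAYERS`, `check_feed_forward`, and `skip_connections` are illustrative assumptions, and the channel widths are left out because they do not appear in the table as given here.

```python
# Layer wiring transcribed from the table: (layer no., input layer nos., layer type).
# Channel widths are omitted because they are not available in the table.
LAYERS = [
    (0, [], "Input"),
    (1, [0], "Concatenation"),
    (2, [1], "Channel Attention"),
    (3, [1, 2], "Elementwise Add"),
    (4, [3], "Convolution"),
    (5, [4], "RCAB"),
    (6, [5], "LeakyReLU"),
    (7, [6], "Channel Attention"),
    (8, [6, 7], "Elementwise Add"),
    (9, [8], "Convolution"),
    (10, [9], "RCAB"),
    (11, [10], "LeakyReLU"),
    (12, [11], "Output"),
]


def check_feed_forward(layers):
    """Return True if every layer reads only from earlier layers."""
    return all(src < idx for idx, inputs, _ in layers for src in inputs)


def skip_connections(layers):
    """Return (add_layer, bypassed_source) pairs for every two-input layer."""
    return [(idx, min(inputs)) for idx, inputs, _ in layers if len(inputs) == 2]


if __name__ == "__main__":
    print("feed-forward:", check_feed_forward(LAYERS))
    print("skip connections:", skip_connections(LAYERS))
```

Read this way, the table describes the same five-step pattern twice (channel attention, residual add, convolution, RCAB, LeakyReLU), with each elementwise add combining the channel-attention output with that block's own input.
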

| Method | Metric | Calendar | City | Foliage | Walk | Average |
|---|---|---|---|---|---|---|
| IMDN [25] | PSNR | 22.1185 | 25.9733 | 24.6737 | 28.4131 | 25.2947 |
| | SSIM | 0.7078 | 0.6811 | 0.6564 | 0.8693 | 0.7287 |
| SWRN [19] | PSNR | 21.7028 | 25.8976 | 24.4783 | 27.9725 | 25.0128 |
| | SSIM | 0.6749 | 0.6731 | 0.6458 | 0.8585 | 0.7131 |
| 3DSRnet [31] | PSNR | 22.5174 | 27.1086 | 25.5571 | 27.7540 | 25.7433 |
| | SSIM | 0.6586 | 0.6987 | 0.6898 | 0.8681 | 0.7288 |
| TOF [7] | PSNR | 22.4371 | 26.6647 | 25.3451 | 28.9459 | 25.8482 |
| | SSIM | 0.7242 | 0.7356 | 0.707 | 0.8799 | 0.7617 |
| EGVSR [20] | PSNR | 23.5585 | 27.4242 | 24.7348 | 27.9476 | 25.9163 |
| | SSIM | 0.7959 | 0.8009 | 0.7091 | 0.8599 | 0.7915 |
| SOFVSR [30] | PSNR | 22.7644 | 26.8175 | 25.5315 | 29.1136 | 26.0568 |
| | SSIM | 0.7463 | 0.7495 | 0.7182 | 0.8835 | 0.7744 |
| RISTN [42] | PSNR | 22.9171 | 26.9975 | 25.5761 | 29.2238 | 26.1786 |
| | SSIM | 0.7504 | 0.7582 | 0.7221 | 0.8814 | 0.7780 |
| Ours | PSNR | 23.0747 | 27.0412 | 25.6840 | 29.6298 | 26.3574 |
| | SSIM | 0.7626 | 0.7642 | 0.7255 | 0.8937 | 0.7865 |
| Ours (RAI) | PSNR | 23.0682 | 27.0244 | 25.6730 | 29.6147 | 26.3451 |
| | SSIM | 0.7621 | 0.7627 | 0.7248 | 0.8935 | 0.7858 |
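
For readers who want to relate these numbers to the metric, here is a minimal Python sketch of the standard PSNR definition, followed by a check that the "Average" entry for "Ours" matches the arithmetic mean of the four per-sequence PSNR values. The function name `psnr`, the 8-bit peak value of 255, and the flat-list pixel representation are assumptions for illustration; the table does not state how borders or colour channels were handled.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """PSNR in dB between two equal-length sequences of pixel values.

    Assumes `peak` is the maximum possible pixel value (255 for 8-bit data).
    """
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)


# Per-sequence PSNR of "Ours", copied from the table.
ours_psnr = {"Calendar": 23.0747, "City": 27.0412, "Foliage": 25.6840, "Walk": 29.6298}
average = sum(ours_psnr.values()) / len(ours_psnr)

if __name__ == "__main__":
    print(f"mean PSNR over the four sequences: {average:.4f} dB")
```

The computed mean agrees with the reported 26.3574 to four decimals, which suggests the "Average" column is a plain per-sequence mean rather than a frame-weighted one.
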

| Sequence | Metric | IMDN [25] | SWRN [19] | TOF [7] | EGVSR [20] | SOFVSR [30] | RISTN [42] | Ours | Ours (RAI) |
|---|---|---|---|---|---|---|---|---|---|
| AMVTG_004 | PSNR | 26.0314 | 24.9373 | 24.8697 | 22.7946 | 25.1903 | 25.6177 | 26.4554 | 26.4566 |
| | SSIM | 0.7179 | 0.6398 | 0.6442 | 0.5583 | 0.6643 | 0.6820 | 0.7456 | 0.7456 |
| HKVTG_004 | PSNR | 28.5451 | 28.2554 | 28.4698 | 25.3357 | 28.6597 | 28.6793 | 28.8824 | 28.8816 |
| | SSIM | 0.7519 | 0.7391 | 0.7483 | 0.6234 | 0.7596 | 0.7597 | 0.7694 | 0.7693 |
| LDVTG_009 | PSNR | 26.8267 | 25.7857 | 26.4415 | 27.1797 | 27.0993 | 27.6783 | 27.7354 | 27.7327 |
| | SSIM | 0.8353 | 0.8052 | 0.8345 | 0.8551 | 0.8501 | 0.8588 | 0.8653 | 0.8653 |
| LDVTG_022 | PSNR | 29.7429 | 29.2831 | 29.3073 | 27.1004 | 29.4682 | 29.8795 | 30.1291 | 30.1330 |
| | SSIM | 0.8496 | 0.8352 | 0.8392 | 0.7822 | 0.8432 | 0.8502 | 0.8604 | 0.8605 |
| NYVTG_006 | PSNR | 29.4499 | 29.2991 | 30.2000 | 25.9653 | 30.9459 | 30.6516 | 30.9080 | 30.9066 |
| | SSIM | 0.8564 | 0.8452 | 0.8599 | 0.7647 | 0.8785 | 0.8723 | 0.8819 | 0.8818 |
| PRVTG_008 | PSNR | 23.8790 | 23.4167 | 23.8181 | 21.1342 | 24.1036 | 24.2942 | 24.5739 | 24.5743 |
| | SSIM | 0.6764 | 0.6534 | 0.6741 | 0.5533 | 0.6959 | 0.7067 | 0.7167 | 0.7166 |
| PRVTG_012 | PSNR | 26.4371 | 26.2694 | 26.5504 | 24.5646 | 26.7048 | 26.8525 | 26.9413 | 26.9435 |
| | SSIM | 0.7718 | 0.7628 | 0.7750 | 0.7129 | 0.7839 | 0.7899 | 0.7945 | 0.7945 |
| RMVTG_011 | PSNR | 25.8581 | 25.4017 | 25.9722 | 23.8420 | 26.2772 | 26.5048 | 26.6950 | 26.6929 |
| | SSIM | 0.7458 | 0.7254 | 0.7488 | 0.6703 | 0.7642 | 0.7711 | 0.7797 | 0.7795 |
| RMVTG_024 | PSNR | 25.2832 | 24.9563 | 25.3642 | 23.5458 | 25.7016 | 25.8718 | 25.9665 | 25.9703 |
| | SSIM | 0.6664 | 0.6488 | 0.6720 | 0.6162 | 0.6972 | 0.7103 | 0.7109 | 0.7109 |
| TPVTG_003 | PSNR | 30.3131 | 29.7610 | 29.9361 | 27.3018 | 29.9714 | 30.3205 | 30.7238 | 30.7237 |
| | SSIM | 0.8815 | 0.8687 | 0.8725 | 0.7860 | 0.8753 | 0.8773 | 0.8898 | 0.8898 |
| cact1_001 | PSNR | 32.2501 | 31.1136 | 32.1522 | 31.3228 | 32.3432 | 32.5230 | 33.3811 | 33.3668 |
| | SSIM | 0.9075 | 0.8904 | 0.9196 | 0.9111 | 0.9214 | 0.9178 | 0.9284 | 0.9281 |
| car05_001 | PSNR | 29.5382 | 29.1795 | 30.0287 | 28.9618 | 30.1839 | 29.8935 | 30.0922 | 30.0779 |
| | SSIM | 0.8423 | 0.8338 | 0.8620 | 0.8309 | 0.8637 | 0.8423 | 0.8537 | 0.8532 |
| gree3_001 | PSNR | 29.9863 | 29.6108 | 29.6615 | 26.9135 | 29.9342 | 30.1444 | 30.3567 | 30.3561 |
| | SSIM | 0.8119 | 0.8030 | 0.8102 | 0.7017 | 0.8180 | 0.8163 | 0.8249 | 0.8248 |
| hdclub_001 | PSNR | 23.8864 | 23.4446 | 23.8494 | 22.8246 | 24.0650 | 24.5185 | 24.8958 | 24.9024 |
| | SSIM | 0.7387 | 0.7148 | 0.7484 | 0.7338 | 0.7603 | 0.7756 | 0.7869 | 0.7870 |
| hdclub_003 | PSNR | 20.3518 | 20.1453 | 20.8639 | 19.3091 | 21.0255 | 21.2868 | 21.1774 | 21.1785 |
| | SSIM | 0.6015 | 0.5881 | 0.6535 | 0.6207 | 0.6679 | 0.6896 | 0.6751 | 0.6752 |
| hdclub_008 | PSNR | 25.9392 | 25.7197 | 26.1188 | 24.6306 | 26.2198 | 26.2789 | 26.4213 | 26.4210 |
| | SSIM | 0.7184 | 0.7045 | 0.7312 | 0.6647 | 0.7369 | 0.7381 | 0.7473 | 0.7473 |
| hitachi_isee5 | PSNR | 23.1258 | 21.9516 | 22.9783 | 24.6103 | 23.3418 | 24.1477 | 24.3415 | 24.3261 |
| | SSIM | 0.8079 | 0.7568 | 0.8055 | 0.8655 | 0.8187 | 0.8358 | 0.8470 | 0.8464 |
| hk001_001 | PSNR | 29.6545 | 29.1030 | 29.5680 | 26.7095 | 29.8456 | 30.1093 | 30.4816 | 30.4851 |
| | SSIM | 0.7580 | 0.7459 | 0.7758 | 0.6702 | 0.7845 | 0.7868 | 0.7949 | 0.7948 |
| hk004_006 | PSNR | 30.8958 | 30.1270 | 30.5283 | 28.1557 | 31.0959 | 31.1093 | 31.7502 | 31.7509 |
| | SSIM | 0.8463 | 0.8324 | 0.8559 | 0.8003 | 0.8634 | 0.8616 | 0.8719 | 0.8719 |
| indi1_004 | PSNR | 33.0746 | 32.2322 | 32.8443 | 31.7546 | 33.0945 | 33.4130 | 34.2882 | 34.2895 |
| | SSIM | 0.8891 | 0.8709 | 0.8913 | 0.8870 | 0.8976 | 0.8988 | 0.9145 | 0.9145 |
| indi1_032 | PSNR | 34.6416 | 33.4208 | 34.4735 | 33.4178 | 34.5877 | 34.8725 | 36.2915 | 36.2836 |
| | SSIM | 0.9246 | 0.9077 | 0.9315 | 0.9201 | 0.9352 | 0.9303 | 0.9469 | 0.9467 |
| jvc_004 | PSNR | 29.8523 | 28.7227 | 30.0097 | 31.6536 | 29.9930 | 30.6357 | 31.0591 | 31.0441 |
| | SSIM | 0.9494 | 0.9329 | 0.9513 | 0.9632 | 0.9502 | 0.9531 | 0.9612 | 0.9611 |
| jvc_009 | PSNR | 27.7313 | 27.1172 | 27.7947 | 26.8302 | 28.0539 | 28.0865 | 28.6781 | 28.6749 |
| | SSIM | 0.8500 | 0.8263 | 0.8517 | 0.8462 | 0.8606 | 0.8571 | 0.8758 | 0.8756 |
| land5_001 | PSNR | 37.0995 | 35.7070 | 36.0325 | 34.8101 | 35.9214 | 36.7371 | 38.2512 | 38.2421 |
| | SSIM | 0.9602 | 0.9548 | 0.9637 | 0.9573 | 0.9647 | 0.9578 | 0.9681 | 0.9680 |
| land9_007 | PSNR | 34.9431 | 33.9314 | 34.4512 | 32.2815 | 34.7169 | 35.0958 | 36.1717 | 36.1766 |
| | SSIM | 0.9161 | 0.9094 | 0.9288 | 0.8839 | 0.9297 | 0.9264 | 0.9360 | 0.9361 |
| philips_hkc01 | PSNR | 35.6114 | 33.9995 | 34.2302 | 34.6706 | 34.9984 | 34.7056 | 36.2910 | 36.2628 |
| | SSIM | 0.9373 | 0.9104 | 0.9125 | 0.9254 | 0.9274 | 0.9193 | 0.9429 | 0.9426 |
| philips_hkc04 | PSNR | 34.2978 | 33.5183 | 34.1432 | 31.0672 | 32.6543 | 32.8824 | 34.2802 | 34.2844 |
| | SSIM | 0.8927 | 0.8797 | 0.8917 | 0.8267 | 0.8712 | 0.8624 | 0.8879 | 0.8880 |
| philips_hkc05 | PSNR | 30.3882 | 29.3879 | 30.8576 | 32.1182 | 30.8332 | 30.7311 | 31.2771 | 31.2218 |
| | SSIM | 0.8448 | 0.8052 | 0.8576 | 0.9017 | 0.8587 | 0.8541 | 0.8711 | 0.8692 |
| philips_hkc11 | PSNR | 36.4101 | 35.2255 | 35.7448 | 32.6174 | 35.6057 | 35.5644 | 36.8548 | 36.8287 |
| | SSIM | 0.8961 | 0.8713 | 0.8846 | 0.8316 | 0.8850 | 0.8810 | 0.9042 | 0.9036 |
| veni3_011 | PSNR | 33.1165 | 32.2105 | 32.7536 | 29.8624 | 33.1524 | 33.1738 | 34.5723 | 34.5763 |
| | SSIM | 0.9562 | 0.9444 | 0.9527 | 0.9047 | 0.9571 | 0.9524 | 0.9645 | 0.9646 |
| Average | PSNR | 29.5054 | 28.7745 | 29.3338 | 27.7762 | 29.5263 | 29.7505 | 30.3308 | 30.3255 |
| | SSIM | 0.8267 | 0.8069 | 0.8283 | 0.7856 | 0.8361 | 0.8378 | 0.8506 | 0.8504 |

| Method | Parameters | FLOPs | Latency (ms) | Real-Time Inference | PSNR on Vid4 | PSNR on SPMCs-30 | 
|---|---|---|---|---|---|---|
| IMDN [25] | 715K | 40.91G | 12.50 | ✓ | 25.2947 | 29.5054 | 
| SWRN [19] | 43K | 5.00G | 9.50 | ✓ | 25.0128 | 28.7744 | 
| TOF [7] | 1405K | 133.06G | 545.77 | ✕ | 25.8482 | 29.3338 | 
| EGVSR [20] | 2587K | 102.89G | 14.17 | ✓ | 25.9163 | 27.7762 | 
| SOFVSR [30] | 1048K | 120.83G | 128.36 | ✕ | 26.0568 | 29.5263 | 
| Ours | 1895K | 336.40G | 101.47 | ✕ | 26.3574 | 30.3308 | 
| Ours (RAI) | 1895K | 109.07G | 39.51 | ✓ | 26.3451 | 30.3255 | 
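
The efficiency figures above can be turned into frame rates and relative savings with a few lines of arithmetic. The sketch below is illustrative only: the 24 FPS real-time threshold (`ASSUMED_REALTIME_FPS`) is an assumption that is not stated in the table, chosen because it happens to reproduce every check mark and cross in the "Real-Time Inference" column; the names `ROWS` and `fps` are likewise hypothetical.

```python
# Latency in ms and the real-time flag for each method, copied from the table.
ROWS = {
    "IMDN": (12.50, True),
    "SWRN": (9.50, True),
    "TOF": (545.77, False),
    "EGVSR": (14.17, True),
    "SOFVSR": (128.36, False),
    "Ours": (101.47, False),
    "Ours (RAI)": (39.51, True),
}

# Assumption: the threshold used for the real-time column is not given in the table.
ASSUMED_REALTIME_FPS = 24.0


def fps(latency_ms):
    """Convert a per-frame latency in milliseconds to frames per second."""
    return 1000.0 / latency_ms


# Does the assumed threshold reproduce the table's real-time column?
consistent = all(
    (fps(ms) >= ASSUMED_REALTIME_FPS) == flag for ms, flag in ROWS.values()
)

# Effect of redundancy-aware inference (RAI) on the proposed model.
flops_saving = 1.0 - 109.07 / 336.40   # fraction of FLOPs removed
speedup = 101.47 / 39.51               # latency ratio, without RAI over with RAI
psnr_drop_vid4 = 26.3574 - 26.3451     # dB lost on Vid4
psnr_drop_spmcs = 30.3308 - 30.3255    # dB lost on SPMCs-30

if __name__ == "__main__":
    print(f"with RAI: {fps(39.51):.1f} FPS, without: {fps(101.47):.1f} FPS")
    print(f"FLOPs saved: {flops_saving:.1%}, speedup: {speedup:.2f}x")
    print(f"PSNR drop: {psnr_drop_vid4:.4f} dB (Vid4), {psnr_drop_spmcs:.4f} dB (SPMCs-30)")
    print("assumed threshold matches table:", consistent)
```

Under these numbers, RAI removes roughly two thirds of the FLOPs and cuts latency by a factor of about 2.6, at a PSNR cost of about 0.01 dB on both test sets.
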

| Model | Spatial Aggregation | Temporal Aggregation | Pre-Trained Parameters | Vid4 PSNR | Vid4 SSIM | SPMCs-30 PSNR | SPMCs-30 SSIM |
|---|---|---|---|---|---|---|---|
| Model 1 | No | No | No | 25.3254 | 0.7249 | 29.3972 | 0.8216 |
| Model 2 | Yes | No | No | 25.9753 | 0.7645 | 29.6212 | 0.8317 |
| Model 3 | Yes | Yes | No | 26.2903 | 0.7808 | 30.0463 | 0.8429 |
| Model 4 | No | No | Yes | 25.4421 | 0.7318 | 29.6574 | 0.8288 |
| Model 5 | Yes | Yes | Yes | 26.3574 | 0.7865 | 30.3308 | 0.8506 |

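
To make the ablation easier to read, the per-component PSNR gains can be computed by differencing rows that differ in exactly one setting. This is a minimal sketch over the values in the table above; the names `ABLATION`, `STEPS`, and `gain` are illustrative, and the differences are descriptive arithmetic only, not a claim about statistical significance.

```python
# Model -> (Vid4 PSNR, SPMCs-30 PSNR), copied from the ablation table.
ABLATION = {
    "Model 1": (25.3254, 29.3972),
    "Model 2": (25.9753, 29.6212),
    "Model 3": (26.2903, 30.0463),
    "Model 4": (25.4421, 29.6574),
    "Model 5": (26.3574, 30.3308),
}

# Pairs of models that differ in exactly one setting.
STEPS = [
    ("Model 1", "Model 2", "add spatial aggregation"),
    ("Model 2", "Model 3", "add temporal aggregation"),
    ("Model 3", "Model 5", "add pre-trained parameters (with aggregation)"),
    ("Model 1", "Model 4", "add pre-trained parameters (no aggregation)"),
]


def gain(before, after):
    """PSNR gain in dB (Vid4, SPMCs-30) when moving from `before` to `after`."""
    return tuple(round(b - a, 4) for a, b in zip(ABLATION[before], ABLATION[after]))


if __name__ == "__main__":
    for before, after, label in STEPS:
        vid4, spmcs = gain(before, after)
        print(f"{label}: {vid4:+.4f} dB on Vid4, {spmcs:+.4f} dB on SPMCs-30")
```

On Vid4 the largest single gain comes from spatial aggregation (about 0.65 dB), whereas on SPMCs-30 temporal aggregation contributes the most (about 0.43 dB).
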
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wang, W.; Liu, Z.; Lu, H.; Lan, R.; Zhang, Z. Real-Time Video Super-Resolution with Spatio-Temporal Modeling and Redundancy-Aware Inference. Sensors 2023, 23, 7880. https://doi.org/10.3390/s23187880