An Omnidirectional Image Super-Resolution Method Based on Enhanced SwinIR
Abstract
1. Introduction
2. Related Work
2.1. Single-Image Super-Resolution (SISR)
2.2. Omnidirectional Image Super-Resolution (ODISR)
3. Architectural Details
3.1. The Entire Network Architecture
3.2. The Location Transformation Module
3.2.1. Positioning Network
3.2.2. Grid Generator
3.2.3. Sampler
3.3. SwinIR with Deformable Convolution
4. Experiments
4.1. Datasets
4.2. Implementation Detail
4.3. Quantitative Results
4.4. Ablation Study
4.5. Qualitative Results
4.6. Training and Validation Results
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Bevilacqua, M.; Roumy, A.; Guillemot, C.; Alberi Morel, M.-L. Low-complexity single-image super-resolution based on nonnegative neighbor embedding. In Proceedings of the British Machine Vision Conference, London, UK, 3–7 September 2012; pp. 135.1–135.10.
2. Zhang, K.; Gao, X.; Tao, D.; Li, X. Single image super-resolution with non-local means and steering kernel regression. IEEE Trans. Image Process. 2012, 21, 4544–4556.
3. Gao, X.; Zhang, K.; Tao, D.; Li, X. Image super-resolution with sparse neighbor embedding. IEEE Trans. Image Process. 2012, 21, 3194–3205.
4. Dong, C.; Loy, C.C.; He, K.; Tang, X. Image super-resolution using deep convolutional networks. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 38, 295–307.
5. Kim, J.; Lee, J.K.; Lee, K.M. Deeply-recursive convolutional network for image super-resolution. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 1637–1645.
6. Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4700–4708.
7. Lim, B.; Son, S.; Kim, H.; Nah, S.; Mu Lee, K. Enhanced deep residual networks for single image super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA, 21–26 July 2017; pp. 136–144.
8. Kim, J.; Lee, J.K.; Lee, K.M. Accurate image super-resolution using very deep convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 1646–1654.
9. Wang, H.; Chen, X.; Ni, B.; Liu, Y.; Liu, J. Omni aggregation networks for lightweight image super-resolution. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023.
10. Lin, J.; Luo, X.; Hong, M.; Qu, Y.; Xie, Y.; Wu, Z. Memory-Friendly Scalable Super-Resolution via Rewinding Lottery Ticket Hypothesis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023.
11. Ledig, C.; Theis, L.; Huszár, F.; Caballero, J.; Cunningham, A.; Acosta, A.; Aitken, A.; Tejani, A.; Totz, J.; Wang, Z.; et al. Photorealistic single image super-resolution using a generative adversarial network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4681–4690.
12. Wang, X.; Yu, K.; Wu, S.; Gu, J.; Liu, Y.; Dong, C.; Qiao, Y.; Change Loy, C. ESRGAN: Enhanced super-resolution generative adversarial networks. In Proceedings of the European Conference on Computer Vision (ECCV) Workshops, Munich, Germany, 8–14 September 2018; pp. 1–16.
13. Zhang, Y.; Li, K.; Li, K.; Wang, L.; Zhong, B.; Fu, Y. Image super-resolution using very deep residual channel attention networks. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 286–301.
14. Zhang, K.; Liang, J.; Van Gool, L.; Timofte, R. Designing a practical degradation model for deep blind image super-resolution. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Virtual, 11–17 October 2021.
15. Wang, X.; Xie, L.; Dong, C.; Shan, Y. Real-ESRGAN: Training real-world blind super-resolution with pure synthetic data. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Virtual, 11–17 October 2021.
16. Mou, C.; Wu, Y.; Wang, X.; Dong, C.; Zhang, J.; Shan, Y. Metric learning based interactive modulation for real-world super-resolution. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; Springer: Cham, Switzerland, 2022.
17. Park, J.; Son, S.; Lee, K.M. Content-aware local GAN for photo-realistic super-resolution. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 2–6 October 2023.
18. Lee, M.; Heo, J.-P. Noise-free optimization in early training steps for image super-resolution. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 20–27 February 2024; Volume 38, No. 4.
19. Chen, H.; Wang, Y.; Guo, T.; Xu, C.; Deng, Y.; Liu, Z.; Ma, S.; Xu, C.; Xu, C.; Gao, W. Pre-trained image processing transformer. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020; pp. 12299–12310.
20. Zhou, Y.; Li, Z.; Guo, C.L.; Bai, S.; Cheng, M.M.; Hou, Q. SRFormer: Permuted self-attention for single image super-resolution. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 2–6 October 2023.
21. Chen, Z.; Zhang, Y.; Gu, J.; Kong, L.; Yang, X.; Yu, F. Dual aggregation transformer for image super-resolution. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 2–6 October 2023.
22. Zhou, X.; Huang, H.; He, R.; Wang, Z.; Hu, J.; Tan, T. MSRA-SR: Image super-resolution transformer with multi-scale shared representation acquisition. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 2–6 October 2023.
23. Chen, X.; Wang, X.; Zhou, J.; Qiao, Y.; Dong, C. Activating more pixels in image super-resolution transformer. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023.
24. Luo, X.; Xie, Y.; Qu, Y.; Fu, Y. SkipDiff: Adaptive Skip Diffusion Model for High-Fidelity Perceptual Image Super-resolution. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 20–27 February 2024; Volume 38, No. 5.
25. Yuan, Y.; Yuan, C. Efficient Conditional Diffusion Model with Probability Flow Sampling for Image Super-resolution. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 20–27 February 2024; Volume 38, No. 7.
26. Liang, J.; Cao, J.; Sun, G.; Zhang, K.; Van Gool, L.; Timofte, R. SwinIR: Image restoration using Swin transformer. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Virtual, 11–17 October 2021; pp. 1833–1844.
27. Deng, X.; Wang, H.; Xu, M.; Guo, Y.; Song, Y.; Yang, L. LAU-Net: Latitude adaptive upscaling network for omnidirectional image super-resolution. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual, 19–25 June 2021; pp. 9189–9198.
28. Yoon, Y.; Chung, I.; Wang, L.; Yoon, K.J. SphereSR: 360° image super-resolution with arbitrary projection via continuous spherical image representation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 5677–5686.
29. Yu, F.; Wang, X.; Cao, M.; Li, G.; Shan, Y.; Dong, C. OSRT: Omnidirectional image super resolution with distortion-aware transformer. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023.
30. Sun, Y.; Lu, A.; Yu, L. Weighted-to-spherically uniform quality evaluation for omnidirectional video. IEEE Signal Process. Lett. 2017, 24, 1408–1412.
31. Zhou, Y.; Yu, M.; Ma, H.; Shao, H.; Jiang, G. Weighted-to-spherically-uniform SSIM objective quality evaluation for panoramic video. In Proceedings of the 2018 14th IEEE International Conference on Signal Processing (ICSP), Beijing, China, 12–16 August 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 54–57.
32. Jaderberg, M.; Simonyan, K.; Zisserman, A. Spatial Transformer Networks. arXiv 2015, arXiv:1506.02025.
33. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
34. Dai, J.; Qi, H.; Xiong, Y.; Li, Y.; Zhang, G.; Hu, H.; Wei, Y. Deformable convolutional networks. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 764–773.
35. Xiao, J.; Ehinger, K.A.; Oliva, A.; Torralba, A. Recognizing scene viewpoint using panoramic place representation. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012; IEEE: Piscataway, NJ, USA, 2012; pp. 2695–2702.

Scale: ×2

Method | ODI-SR PSNR (dB) | ODI-SR SSIM | ODI-SR LPIPS | SUN 360 Panorama PSNR (dB) | SUN 360 Panorama SSIM | SUN 360 Panorama LPIPS |
---|---|---|---|---|---|---|
Bicubic | 28.26 | 0.8216 | 0.353 | 28.54 | 0.8279 | 0.398 |
SRCNN [4] | 29.03 | 0.8452 | 0.342 | 29.26 | 0.8426 | 0.364 |
VDSR [8] | 30.12 | 0.8703 | 0.265 | 30.11 | 0.8733 | 0.302 |
RCAN [13] | 30.15 | 0.8725 | 0.226 | 30.52 | 0.8745 | 0.264 |
EDSR [7] | 30.32 | 0.8711 | 0.269 | 30.65 | 0.8720 | 0.279 |
ESRGAN [12] | 30.36 | 0.8769 | 0.205 | 30.85 | 0.8812 | 0.198 |
BSRGAN [14] | 30.32 | 0.8795 | 0.187 | 30.98 | 0.8839 | 0.175 |
Real-ESRGAN [15] | 30.59 | 0.8819 | 0.155 | 31.19 | 0.8825 | 0.196 |
SwinIR [26] | 30.54 | 0.8825 | 0.115 | 31.25 | 0.8846 | 0.132 |
LTM-SwinIR (ours) | 30.67 | 0.8836 | 0.102 | 31.39 | 0.8857 | 0.118 |

Scale: ×2

Method | ODI-SR WS-PSNR (dB) | ODI-SR WS-SSIM | SUN 360 Panorama WS-PSNR (dB) | SUN 360 Panorama WS-SSIM |
---|---|---|---|---|
Bicubic | 27.32 | 0.8059 | 28.50 | 0.8356 |
SRCNN [4] | 28.20 | 0.8312 | 28.96 | 0.8402 |
VDSR [8] | 29.56 | 0.8716 | 29.65 | 0.8736 |
RCAN [13] | 29.63 | 0.8669 | 29.41 | 0.8749 |
EDSR [7] | 29.65 | 0.8772 | 29.66 | 0.8768 |
ESRGAN [12] | 29.86 | 0.8769 | 29.79 | 0.8775 |
BSRGAN [14] | 30.25 | 0.8698 | 30.25 | 0.8799 |
Real-ESRGAN [15] | 30.36 | 0.8716 | 30.19 | 0.8859 |
SwinIR [26] | 30.32 | 0.8720 | 30.39 | 0.8886 |
LTM-SwinIR (ours) | 30.54 | 0.8799 | 30.62 | 0.8870 |
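
The WS-PSNR columns weight each pixel's error by how much area it covers on the sphere, so the stretched polar rows of an equirectangular (ERP) image count less than rows near the equator. The following is a minimal sketch of that metric, following the cosine-of-latitude weighting proposed by Sun et al. for ERP images. It is not the evaluation code used for the tables: the function names `erp_row_weights` and `ws_psnr`, the grayscale list-of-rows image format, and the peak value of 255 are all assumptions made for illustration.

```python
import math


def erp_row_weights(height):
    """Per-row weights for an ERP image: cosine of the latitude at each row centre."""
    return [math.cos((i + 0.5 - height / 2) * math.pi / height) for i in range(height)]


def ws_psnr(reference, distorted, max_value=255.0):
    """WS-PSNR in dB between two equally sized grayscale ERP images.

    Both images are lists of rows (assumed format); max_value is the assumed peak intensity.
    """
    weights = erp_row_weights(len(reference))
    weighted_error = 0.0
    weight_sum = 0.0
    for w, ref_row, dis_row in zip(weights, reference, distorted):
        for r, d in zip(ref_row, dis_row):
            weighted_error += w * (r - d) ** 2  # squared error scaled by spherical area
            weight_sum += w
    wmse = weighted_error / weight_sum  # weighted mean squared error
    if wmse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / wmse)
```

Calling `ws_psnr(hr_image, sr_image)` on a ground-truth image and a super-resolved output of the same size returns a single score in dB. With uniform weights the function reduces to ordinary PSNR, which is one way to see why the PSNR and WS-PSNR columns differ for the same method.
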

Scale: ×4

Method | ODI-SR PSNR (dB) | ODI-SR SSIM | ODI-SR LPIPS | SUN 360 Panorama PSNR (dB) | SUN 360 Panorama SSIM | SUN 360 Panorama LPIPS |
---|---|---|---|---|---|---|
Bicubic | 25.39 | 0.7089 | 0.574 | 25.29 | 0.7069 | 0.608 |
SRCNN [4] | 25.69 | 0.7319 | 0.428 | 26.16 | 0.7365 | 0.526 |
VDSR [8] | 26.75 | 0.7622 | 0.399 | 27.13 | 0.7639 | 0.422 |
RCAN [13] | 26.89 | 0.7599 | 0.352 | 27.22 | 0.7659 | 0.395 |
EDSR [7] | 27.08 | 0.7624 | 0.403 | 27.35 | 0.7709 | 0.355 |
ESRGAN [12] | 26.99 | 0.7689 | 0.326 | 27.39 | 0.7738 | 0.386 |
BSRGAN [14] | 27.26 | 0.7695 | 0.295 | 27.29 | 0.7729 | 0.308 |
Real-ESRGAN [15] | 27.32 | 0.7702 | 0.302 | 27.50 | 0.7755 | 0.226 |
SwinIR [26] | 27.36 | 0.7708 | 0.282 | 27.56 | 0.7795 | 0.256 |
LTM-SwinIR (ours) | 27.41 | 0.7726 | 0.203 | 27.99 | 0.7820 | 0.199 |

Scale: ×4

Method | ODI-SR WS-PSNR (dB) | ODI-SR WS-SSIM | SUN 360 Panorama WS-PSNR (dB) | SUN 360 Panorama WS-SSIM |
---|---|---|---|---|
Bicubic | 24.96 | 0.6985 | 25.38 | 0.7059 |
SRCNN [4] | 25.13 | 0.7256 | 26.02 | 0.7423 |
VDSR [8] | 26.16 | 0.7459 | 26.98 | 0.7812 |
RCAN [13] | 26.23 | 0.7449 | 27.12 | 0.7859 |
EDSR [7] | 26.44 | 0.7478 | 27.30 | 0.7860 |
ESRGAN [12] | 26.39 | 0.7502 | 27.35 | 0.7895 |
BSRGAN [14] | 26.41 | 0.7519 | 27.46 | 0.7899 |
Real-ESRGAN [15] | 26.49 | 0.7522 | 27.52 | 0.7906 |
SwinIR [26] | 26.61 | 0.7546 | 27.60 | 0.7915 |
LTM-SwinIR (ours) | 26.69 | 0.7553 | 27.82 | 0.7966 |

Scale: ×4 (x = component disabled, √ = component enabled)

Model | D-Conv | LTM | ODI-SR WS-PSNR (dB) | ODI-SR WS-SSIM | SUN 360 Panorama WS-PSNR (dB) | SUN 360 Panorama WS-SSIM |
---|---|---|---|---|---|---|
1 | x | x | 26.49 | 0.7509 | 27.09 | 0.7895 |
2 | x | √ | 26.68 | 0.7544 | 27.32 | 0.7933 |
3 | √ | √ | 26.76 | 0.7558 | 27.40 | 0.7956 |

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Yao, X.; Pan, Y.; Wang, J. An Omnidirectional Image Super-Resolution Method Based on Enhanced SwinIR. Information 2024, 15, 248. https://doi.org/10.3390/info15050248