Gram-GAN: Image Super-Resolution Based on Gram Matrix and Discriminator Perceptual Loss
Abstract
1. Introduction
- To improve the flexibility of model inference, this paper proposes constructing a Gram matrix for image patches and using it as an additional supervision signal alongside the ground truth (GT). This supervision ignores the positional information of images and focuses only on texture information, which reduces the generation of distorted structures that deviate strongly from the GT.
- We propose a discriminator perceptual loss dedicated to the super-resolution (SR) task, built on the two-network architecture of generative adversarial networks (GANs). Compared with the traditional perceptual loss, it gives the network additional inference logic from the SR perspective.
- We compare Gram-GAN with a range of advanced perception-driven methods to demonstrate the effectiveness of the proposed method, and we perform ablation experiments to verify that the constructed extra supervision and the discriminator perceptual loss are each necessary.
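The first contribution rests on the fact that a Gram matrix discards spatial layout and keeps only channel co-occurrence statistics. The following is a minimal pure-Python sketch of that property, not the paper's implementation. It assumes a patch is represented as C flattened channel vectors and that the Gram matrix is normalized by the number of positions; `gram_matrix` and `shuffle_positions` are hypothetical helper names, and the patch values are toy data.

```python
import random


def gram_matrix(features):
    """Gram matrix of a patch given as C channel vectors of equal length N.

    Entry (i, j) is the inner product of channels i and j averaged over the
    N spatial positions (the 1/N normalization is an assumption), so it
    records which channel responses co-occur, not where they occur.
    """
    n = len(features[0])
    return [
        [sum(a * b for a, b in zip(ci, cj)) / n for cj in features]
        for ci in features
    ]


def shuffle_positions(features, seed=0):
    """Apply the same random permutation of positions to every channel."""
    order = list(range(len(features[0])))
    random.Random(seed).shuffle(order)
    return [[channel[k] for k in order] for channel in features]


# Hypothetical 2-channel patch flattened to 4 spatial positions (toy values).
patch = [
    [1.0, 2.0, 3.0, 4.0],
    [0.5, 0.0, 1.5, 1.0],
]
shuffled = shuffle_positions(patch)

g_original = gram_matrix(patch)
g_shuffled = gram_matrix(shuffled)

# The two Gram matrices agree even though the positions were rearranged.
print(g_original)
print(g_shuffled)
```

Because the sum runs over all positions, any rearrangement of positions within the patch leaves the Gram matrix unchanged, which is why a supervision signal built from it constrains texture statistics rather than exact pixel placement.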
2. Related Work
2.1. PSNR-Oriented Methods
2.2. Perception-Driven Methods
3. Methods
3.1. Extra Supervision Based on Gram Matrix
3.2. Discriminator Perceptual Loss
3.3. Other Loss Functions
3.3.1. Perceptual Loss
3.3.2. Adversarial Loss
3.3.3. Content Loss
3.3.4. Overall Loss
4. Experiments
4.1. Datasets and Similarity Measures
4.2. Training Details
4.3. Comparison with State-of-the-Art Technologies
4.3.1. Quantitative Results
4.3.2. Qualitative Results
4.4. Ablation Study
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Dong, C.; Loy, C.C.; He, K.; Tang, X. Learning a Deep Convolutional Network for Image Super-Resolution. In Proceedings of the ECCV, Zurich, Switzerland, 6–12 September 2014.
- Kim, J.; Lee, J.K.; Lee, K.M. Accurate Image Super-Resolution Using Very Deep Convolutional Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016.
- Ledig, C.; Theis, L.; Huszar, F.; Caballero, J.; Cunningham, A.; Acosta, A.; Aitken, A.; Tejani, A.; Totz, J.; Wang, Z. Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network. In Proceedings of the IEEE Computer Society, Las Vegas, NV, USA, 27–30 June 2016.
- Zhang, Y.; Li, K.; Li, K.; Wang, L.; Zhong, B.; Fu, Y. Image Super-Resolution Using Very Deep Residual Channel Attention Networks. arXiv 2018, arXiv:1807.02758.
- Hu, X.; Mu, H.; Zhang, X.; Wang, Z.; Tan, T.; Sun, J. Meta-SR: A Magnification-Arbitrary Network for Super-Resolution. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019.
- Li, Z.; Yang, J.; Liu, Z.; Yang, X.; Wu, W. Feedback Network for Image Super-Resolution. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019.
- Hussein, S.A.; Tirer, T.; Giryes, R. Correction Filter for Single Image Super-Resolution: Robustifying Off-the-Shelf Deep Super-Resolvers. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020.
- Wang, Z.; Simoncelli, E.P.; Bovik, A.C. Multiscale structural similarity for image quality assessment. In Proceedings of the Thirty-Seventh Asilomar Conference on Signals, Systems and Computers, Pacific Grove, CA, USA, 9–12 November 2003.
- Gupta, P.; Srivastava, P.; Bhardwaj, S.; Bhateja, V. A modified PSNR metric based on HVS for quality assessment of color images. In Proceedings of the 2011 International Conference on Communication and Industrial Application, Kolkata, India, 26–28 December 2012.
- Johnson, J.; Alahi, A.; Fei-Fei, L. Perceptual Losses for Real-Time Style Transfer and Super-Resolution; Springer: Cham, Switzerland, 2016.
- Wang, X.; Yu, K.; Wu, S.; Gu, J.; Liu, Y.; Dong, C.; Qiao, Y.; Loy, C.C. ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks. arXiv 2018, arXiv:1809.00219.
- Rad, M.S.; Bozorgtabar, B.; Marti, U.V.; Basler, M.; Ekenel, H.K.; Thiran, J.P. SROBB: Targeted Perceptual Loss for Single Image Super-Resolution. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019.
- Wang, X.; Yu, K.; Dong, C.; Loy, C.C. Recovering Realistic Texture in Image Super-resolution by Deep Spatial Feature Transform. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018.
- Soh, J.W.; Gu, Y.P.; Jo, J.; Cho, N.I. Natural and Realistic Single Image Super-Resolution With Explicit Natural Manifold Discrimination. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019.
- Rakotonirina, N.C.; Rasoanaivo, A. ESRGAN+: Further Improving Enhanced Super-Resolution Generative Adversarial Network. In Proceedings of the ICASSP 2020—IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 4–8 May 2020; pp. 3637–3641.
- Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv 2014, arXiv:1409.1556.
- Li, W.; Zhou, K.; Qi, L.; Lu, L.; Jiang, N.; Lu, J.; Jia, J. Best-Buddy GANs for Highly Detailed Image Super-Resolution. Proc. AAAI 2022, 36, 1412–1420.
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. In Proceedings of the NeurIPS, Lake Tahoe, NV, USA, 3–6 December 2012.
- Wen, Y.; Zhang, K.; Li, Z.; Qiao, Y. A Discriminative Feature Learning Approach for Deep Face Recognition. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016.
- Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial networks. Commun. ACM 2020, 63, 139–144.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- Jolicoeur-Martineau, A. The relativistic discriminator: A key element missing from standard GAN. In Proceedings of the ICLR 2019, New Orleans, LA, USA, 6–9 May 2019.
- Agustsson, E.; Timofte, R. Ntire 2017 challenge on single image super-resolution: Dataset and study. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA, 21–26 July 2017; pp. 126–135.
- Bevilacqua, M.; Roumy, A.; Guillemot, C.; Alberi-Morel, M.L. Low-complexity single-image super-resolution based on nonnegative neighbor embedding. In Proceedings of the 23rd British Machine Vision Conference (BMVC), Surrey, UK, 3–7 September 2012.
- Zeyde, R.; Elad, M.; Protter, M. On single image scale-up using sparse-representations. In Proceedings of the International Conference on Curves and Surfaces, Avignon, France, 24–30 June 2010; Springer: Cham, Switzerland, 2010; pp. 711–730.
- Martin, D.; Fowlkes, C.; Tal, D.; Malik, J. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Proceedings of the Eighth IEEE International Conference on Computer Vision, ICCV 2001, Vancouver, BC, Canada, 7–14 July 2001; Volume 2, pp. 416–423.
- Huang, J.B.; Singh, A.; Ahuja, N. Single image super-resolution from transformed self-exemplars. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 5197–5206.
- Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–612.
- Zhang, R.; Isola, P.; Efros, A.A.; Shechtman, E.; Wang, O. The unreasonable effectiveness of deep features as a perceptual metric. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 586–595.
- Mittal, A.; Soundararajan, R.; Bovik, A.C. Making a “completely blind” image quality analyzer. IEEE Signal Process. Lett. 2012, 20, 209–212.
Dataset | Metric | Bicubic | SRGAN [3] | ESRGAN [11] | SFTGAN [13] | ESRGAN+ [15] | Beby-GAN [17] | Gram-GAN (Ours) |
---|---|---|---|---|---|---|---|---|
Set5 | PSNR | 26.69 | 26.69 | 26.50 | 27.26 | 25.88 | 27.82 | 27.97 |
Set5 | SSIM | 0.7736 | 0.7813 | 0.7565 | 0.7765 | 0.7511 | 0.8004 | 0.8021 |
Set5 | LPIPS | 0.3644 | 0.1305 | 0.1080 | 0.1028 | 0.1178 | 0.0875 | 0.0867 |
Set5 | NIQE | 29.56 | 24.58 | 18.75 | 26.87 | 19.45 | 25.40 | 21.34 |
Set14 | PSNR | 26.08 | 25.88 | 25.52 | 26.29 | 25.01 | 26.86 | 26.96 |
Set14 | SSIM | 0.7467 | 0.7480 | 0.7175 | 0.7397 | 0.7159 | 0.7691 | 0.7710 |
Set14 | LPIPS | 0.3870 | 0.1421 | 0.1254 | 0.1177 | 0.1362 | 0.1009 | 0.1003 |
Set14 | NIQE | 25.22 | 18.60 | 15.19 | 16.71 | 16.09 | 18.45 | 17.27 |
BSD100 | PSNR | 26.07 | 24.65 | 24.95 | 25.71 | 24.62 | 26.13 | 26.32 |
BSD100 | SSIM | 0.7177 | 0.7063 | 0.6785 | 0.7065 | 0.6893 | 0.7347 | 0.7376 |
BSD100 | LPIPS | 0.4454 | 0.1622 | 0.1428 | 0.1357 | 0.1446 | 0.1192 | 0.1202 |
BSD100 | NIQE | 24.35 | 19.64 | 16.27 | 17.23 | 17.76 | 21.05 | 18.53 |
Urban100 | PSNR | 24.73 | 24.04 | 24.21 | 25.04 | 23.98 | 25.72 | 25.89 |
Urban100 | SSIM | 0.7101 | 0.7209 | 0.7045 | 0.7314 | 0.7182 | 0.7652 | 0.7679 |
Urban100 | LPIPS | 0.4346 | 0.1534 | 0.1354 | 0.1259 | 0.1334 | 0.1066 | 0.1076 |
Urban100 | NIQE | 20.63 | 14.93 | 12.52 | 13.12 | 13.38 | 15.76 | 14.28 |
Metric | | | | Set5 | Set14 | BSD100 | Urban100 |
---|---|---|---|---|---|---|---|
PSNR | ✓ | | | 27.72 | 26.69 | 26.06 | 25.59 |
PSNR | ✓ | ✓ | | 27.96 | 26.97 | 26.24 | 25.79 |
PSNR | ✓ | ✓ | ✓ | 27.97 | 26.96 | 26.32 | 25.89 |
SSIM | ✓ | | | 0.7967 | 0.7647 | 0.7290 | 0.7593 |
SSIM | ✓ | ✓ | | 0.8016 | 0.7709 | 0.7353 | 0.7654 |
SSIM | ✓ | ✓ | ✓ | 0.8021 | 0.7710 | 0.7376 | 0.7679 |
LPIPS | ✓ | | | 0.0891 | 0.1031 | 0.1215 | 0.1099 |
LPIPS | ✓ | ✓ | | 0.0883 | 0.1021 | 0.1205 | 0.1079 |
LPIPS | ✓ | ✓ | ✓ | 0.0867 | 0.1003 | 0.1202 | 0.1076 |
NIQE | ✓ | | | 22.21 | 18.34 | 19.32 | 14.70 |
NIQE | ✓ | ✓ | | 24.36 | 20.32 | 19.17 | 14.65 |
NIQE | ✓ | ✓ | ✓ | 21.34 | 17.27 | 18.53 | 14.28 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Song, J.; Yi, H.; Xu, W.; Li, B.; Li, X. Gram-GAN: Image Super-Resolution Based on Gram Matrix and Discriminator Perceptual Loss. Sensors 2023, 23, 2098. https://doi.org/10.3390/s23042098