Semantic Super-Resolution of Text Images via Self-Distillation
Abstract
1. Introduction
- We introduce a semantic loss that measures the difference between SR text images.
- We propose a semantic SR method based on self-distillation that encourages the SR results of text images to be semantically similar by minimizing the semantic loss.
- The proposed method can be applied to an existing SR model without modifying its network structure, and the resulting model can be used as is for SR of images from other categories.
- The performance of the proposed method is validated on different text image datasets in various aspects. The experiments show that the proposed method outperforms a GAN-based semantic SR method.
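The idea of a semantic loss between two SR text images can be illustrated with a minimal sketch. The formula is not given in this part of the text, so the sketch assumes the loss is a mean squared distance between feature vectors extracted from the two images. The function name `semantic_loss`, the toy feature vectors, and the choice of distance are all hypothetical and are not taken from the paper.

```python
from typing import Sequence


def semantic_loss(feat_a: Sequence[float], feat_b: Sequence[float]) -> float:
    """Mean squared distance between two feature vectors (assumed form)."""
    if len(feat_a) != len(feat_b):
        raise ValueError("feature vectors must have the same length")
    if not feat_a:
        raise ValueError("feature vectors must be non-empty")
    return sum((a - b) ** 2 for a, b in zip(feat_a, feat_b)) / len(feat_a)


# Hypothetical features of two SR text images (illustrative values only).
feat_1 = [0.2, 0.8, 0.5]
feat_2 = [0.2, 0.6, 0.9]

print(semantic_loss(feat_1, feat_1))  # identical features give zero loss
print(semantic_loss(feat_1, feat_2))  # differing features give a positive loss
```

Under this assumption, minimizing the loss pulls the features of the two SR results toward each other, which is one way to read "semantically similar".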
2. Related Work
2.1. Self-Distillation
2.2. Enhanced Deep Super-Resolution Network (EDSR)
2.3. Deep Learning-Based Text Image Super-Resolution
2.4. Semantic Super-Resolution
3. Proposed Method
Algorithm 1: Semantic SR via self-distillation
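The body of Algorithm 1 is not reproduced here, so the following is only a rough sketch of how a self-distillation objective of this kind might be computed for one pair of text images. It assumes that the same SR model processes both images, that a reconstruction term (L1 here) is combined with a weighted semantic term, and that the semantic term is a mean squared distance between features. The names `sr_model`, `features`, `lam`, and the toy stand-ins below are hypothetical. A real setup would use a network such as EDSR and a learned feature extractor.

```python
from typing import Callable, List

Image = List[float]  # a flattened toy image; real images would be tensors


def l1_loss(pred: Image, target: Image) -> float:
    """Mean absolute error between a prediction and its HR target."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def mse(a: List[float], b: List[float]) -> float:
    """Mean squared distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def self_distillation_loss(
    sr_model: Callable[[Image], Image],
    features: Callable[[Image], List[float]],
    lr_a: Image,
    hr_a: Image,
    lr_b: Image,
    lam: float = 0.1,  # assumed weight of the semantic term
) -> float:
    sr_a = sr_model(lr_a)
    sr_b = sr_model(lr_b)  # same network; its output serves as the target
    reconstruction = l1_loss(sr_a, hr_a)
    semantic = mse(features(sr_a), features(sr_b))
    return reconstruction + lam * semantic


# Toy stand-ins: an identity "model" (no upscaling) and two simple statistics.
toy_model = lambda img: list(img)
toy_features = lambda img: [sum(img) / len(img), max(img) - min(img)]

loss = self_distillation_loss(
    toy_model, toy_features, lr_a=[0.0, 1.0], hr_a=[0.0, 1.0], lr_b=[0.5, 0.5]
)
print(loss)
```

Because the semantic term only adds to the training objective, a sketch like this leaves the network structure untouched, in line with the contribution stated in the Introduction.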
4. Experimental Results and Discussion
4.1. Setup
4.2. Results and Discussion
4.3. Limitations
5. Conclusions and Future Work
Funding
Acknowledgments
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| SR | Super-resolution |
| HR | High-resolution |
| LR | Low-resolution |
| CNN | Convolutional neural network |
| GAN | Generative adversarial network |
| LSTM | Long short-term memory |
| EDSR | Enhanced deep super-resolution network |
| ResNet | Residual neural network |
| PSNR | Peak signal-to-noise ratio |
| SSIM | Structural similarity index measure |
References
- Yang, J.; Wright, J.; Huang, T.S.; Ma, Y. Image Super-Resolution via Sparse Representation. IEEE Trans. Image Process. 2010, 19, 2861–2873.
- Tai, Y.W.; Liu, S.; Brown, M.S.; Lin, S.C.F. Super resolution using edge prior and single image detail synthesis. In Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, CA, USA, 13–18 June 2010; pp. 2400–2407.
- Yamashita, R.; Nishio, M.; Do, R.; Togashi, K. Convolutional neural networks: An overview and application in radiology. Insights Into Imaging 2018, 9, 611–629.
- Wang, Z.; Chen, J.; Hoi, S.H. Deep Learning for Image Super-Resolution: A Survey. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 43, 3365–3387.
- Xu, X.; Sun, D.; Pan, J.; Zhang, Y.; Pfister, H.; Yang, M.H. Learning to Super-Resolve Blurry Face and Text Images. In Proceedings of the IEEE International Conference on Computer Vision (ICCV) 2017, Venice, Italy, 22–29 October 2017; pp. 251–260.
- Wang, W.; Xie, E.; Liu, X.; Wang, W.; Liang, D.; Shen, C.; Bai, X. Scene Text Image Super-Resolution in the Wild. In Proceedings of the European Conference on Computer Vision 2020, Glasgow, UK, 23–28 August 2020; pp. 650–666.
- Ledig, C.; Theis, L.; Huszar, F.; Caballero, J.; Cunningham, A.; Acosta, A.; Aitken, A.; Tejani, A.; Totz, J.; Wang, Z.; et al. Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017, Honolulu, HI, USA, 22–25 July 2017; pp. 105–114.
- Chen, J.; Li, B.; Xue, X. Scene Text Telescope: Text-Focused Scene Image Super-Resolution. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021, Virtual, 19–25 June 2021; pp. 12021–12030.
- Ma, J.; Guo, S.; Zhang, L. Text Prior Guided Scene Text Image Super-resolution. arXiv 2021, arXiv:2106.15368.
- Xu, T.B.; Liu, C.L. Data-Distortion Guided Self-Distillation for Deep Neural Networks. Proc. AAAI Conf. Artif. Intell. 2019, 33, 5565–5572.
- Yun, S.; Park, J.; Lee, K.; Shin, J. Regularizing Class-Wise Predictions via Self-Knowledge Distillation. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 13873–13882.
- Lim, B.; Son, S.; Kim, H.; Nah, S.; Lee, K.M. Enhanced Deep Residual Networks for Single Image Super-Resolution. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Honolulu, HI, USA, 21–26 July 2017; pp. 1132–1140.
- Luo, Z.; Huang, Y.; Li, S.; Wang, L.; Tan, T. Learning the Degradation Distribution for Blind Image Super-Resolution. arXiv 2022, arXiv:2203.04962.
- Zhang, Y.; Wang, H.; Qin, C.; Fu, Y. Learning Efficient Image Super-Resolution Networks via Structure-Regularized Pruning. In Proceedings of the Tenth International Conference on Learning Representations 2022, Virtual, 25 April 2022.
- Dong, C.; Zhu, X.; Deng, Y.; Loy, C.C.; Qiao, Y. Boosting Optical Character Recognition: A Super-Resolution Approach. arXiv 2015, arXiv:1506.02211.
- Dong, C.; Loy, C.C.; He, K.; Tang, X. Learning a Deep Convolutional Network for Image Super-Resolution. In Proceedings of the European Conference on Computer Vision 2014, Zurich, Switzerland, 6–12 September 2014; pp. 184–199.
- Pandey, R.K.; Vignesh, K.; Ramakrishnan, A.G.; Chandrahasa, B. Binary Document Image Super Resolution for Improved Readability and OCR Performance. arXiv 2018, arXiv:1812.02475.
- Tran, H.T.M.; Ho-Phuoc, T. Deep Laplacian Pyramid Network for Text Images Super-Resolution. arXiv 2018, arXiv:1811.10449.
- Chen, J.; Yu, H.; Ma, J.; Li, B.; Xue, X. Text Gestalt: Stroke-Aware Scene Text Image Super-Resolution. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtual, 22 February–1 March 2022; pp. 285–293.
- Nakaune, S.; Iizuka, S.; Fukui, K. Skeleton-aware Text Image Super-Resolution. In Proceedings of the 32nd British Machine Vision Conference, Online, 22–25 November 2021.
- Sun, J.; Zhu, J.; Tappen, M. Context-Constrained Hallucination for Image Super-Resolution. In Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, CA, USA, 13–18 June 2010; pp. 231–238.
- Timofte, R.; Smet, V.; Van Gool, L. Semantic super-resolution: When and where is it useful? Comput. Vis. Image Underst. 2015, 142, 1–12.
- Wang, X.; Yu, K.; Dong, C.; Loy, C.C. Recovering Realistic Texture in Image Super-resolution by Deep Spatial Feature Transform. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2018, Salt Lake City, UT, USA, 18–22 June 2018; pp. 606–615.
- Chen, C.; Shi, X.; Qin, Y.; Li, X.; Han, X.; Yang, T.; Guo, S. Blind Image Super Resolution with Semantic-Aware Quantized Texture Prior. arXiv 2022, arXiv:2202.13142.
- Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. In Proceedings of the 3rd International Conference on Learning Representations 2015, San Diego, CA, USA, 7–9 May 2015.
- Available online: https://github.com/krasserm/super-resolution (accessed on 13 December 2021).
- Available online: https://aihub.or.kr/aidata/30753 (accessed on 13 December 2021).
- Agustsson, E.; Timofte, R. NTIRE 2017 Challenge on Single Image Super-Resolution: Dataset and Study. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Honolulu, HI, USA, 21–26 July 2017; pp. 1122–1131.
- Available online: https://fki.tic.heia-fr.ch/databases (accessed on 7 July 2022).
- Available online: https://www.unifr.ch/inf/diva/en/research/software-data/diva-hisdb.html (accessed on 7 July 2022).
| | OBCC-OCR | IAM-HistDB | DIVA-HisDB | DIV2K |
|---|---|---|---|---|
| w/ | 32.3158/0.9379 | 24.8446/0.6823 | 36.6961/0.9796 | 26.2992/0.7129 |
| w/o | 32.2944/0.9375 | 24.5205/0.6756 | 36.6293/0.9731 | 26.2070/0.6946 |

| | OBCC-OCR | IAM-HistDB | DIVA-HisDB | DIV2K |
|---|---|---|---|---|
| w/ | 30.8653/0.9236 | 24.8410/0.6844 | 36.6691/0.9732 | 28.5653/0.8122 |
| w/o | 30.9723/0.9245 | 25.0024/0.6872 | 36.6763/0.9728 | 28.6086/0.8133 |

| | OBCC-OCR | IAM-HistDB | DIVA-HisDB |
|---|---|---|---|
| | 32.2945/0.9371 | 24.6445/0.6780 | 36.6183/0.9736 |
| | 32.2430/0.9366 | 24.5076/0.6673 | 36.0823/0.9479 |

| | OBCC-OCR | IAM-HistDB | DIVA-HisDB |
|---|---|---|---|
| Multi-class GAN [5] | 32.5117/0.9379 | 24.5745/0.6807 | 36.3281/0.9791 |
| Proposed | 32.3158/0.9379 | 24.8446/0.6823 | 36.6961/0.9796 |
| Difference | −0.1959/0.0000 | 0.2701/0.0016 | 0.3680/0.0005 |
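The "Difference" row of the comparison with Multi-class GAN [5] can be recomputed from the two rows above it. The short check below copies the values from the table and subtracts the Multi-class GAN entry from the Proposed entry for each dataset. It assumes that each cell lists PSNR (in dB) followed by SSIM, which the table itself does not state. The variable and function names are illustrative.

```python
# Values copied from the comparison table; each cell is assumed to be
# (PSNR in dB, SSIM).
multi_class_gan = {
    "OBCC-OCR": (32.5117, 0.9379),
    "IAM-HistDB": (24.5745, 0.6807),
    "DIVA-HisDB": (36.3281, 0.9791),
}
proposed = {
    "OBCC-OCR": (32.3158, 0.9379),
    "IAM-HistDB": (24.8446, 0.6823),
    "DIVA-HisDB": (36.6961, 0.9796),
}


def difference(dataset: str) -> tuple:
    """Proposed minus Multi-class GAN, rounded to four decimals."""
    p_first, p_second = proposed[dataset]
    g_first, g_second = multi_class_gan[dataset]
    return (round(p_first - g_first, 4), round(p_second - g_second, 4))


differences = {name: difference(name) for name in proposed}
for name, (d_first, d_second) in differences.items():
    print(f"{name}: {d_first:+.4f} / {d_second:+.4f}")
```

A positive difference means the proposed method scores higher. Read this way, the first value is lower for the proposed method on OBCC-OCR and higher on IAM-HistDB and DIVA-HisDB, while the second value is equal or higher on all three datasets.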
© 2022 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Park, H. Semantic Super-Resolution of Text Images via Self-Distillation. Electronics 2022, 11, 2137. https://doi.org/10.3390/electronics11142137