Proceeding Paper

Lightweight Network for Single Image Super-Resolution with Arbitrary Scale Factor †

Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan 70142, Taiwan
* Author to whom correspondence should be addressed.
† Presented at the IEEE 5th Eurasia Conference on Biomedical Engineering, Healthcare and Sustainability, Tainan, Taiwan, 2–4 June 2023.
Eng. Proc. 2023, 55(1), 15; https://doi.org/10.3390/engproc2023055015
Published: 28 November 2023

Abstract

Existing single image super-resolution (SISR) methods for integer scale factors (X2, X3, X4, and X8) are well developed, but SISR methods for arbitrary scale factors (X1.3, X2.5, and X3.7) have gained attention only recently. We therefore propose an efficient, lightweight model. This study makes two contributions. (1) An efficient and lightweight SISR network is combined with an upscale module whose weights are determined by the size of the target high-resolution (HR) image. (2) A single model handles all scale factors simultaneously, saving storage and computational resources. Finally, we design various experiments to evaluate the proposed method on multiple general datasets. The experimental results show that the proposed model is lightweight while its performance remains competitive.

1. Introduction

In recent years, convolutional neural networks (CNNs) have become one of the most ubiquitous machine learning solutions for computer vision tasks. CNNs are used extensively in most fields of image processing, and single image super-resolution (SISR) is one of them. SISR, formerly known as up-scaling, generates a high-resolution (HR) image from a single low-resolution (LR) image. CNN-based SISR techniques [1,2,3,4,5,6] have been developed since SRCNN [1], but most of them consider only integer scale factors (X2, X3, X4, and X8), as shown in Figure 1a. Since real-world users often need to up-scale low-resolution (LR) images to a custom size rather than a fixed one, SISR methods with arbitrary scale factors (X1.3, X2.5, and X3.7) have become important. In addition, training a single model that covers every scale factor saves time and effort compared with training one model per factor, as shown in Figure 1b. For these reasons, researchers have sought a single-model solution to this problem.
Meta Super-Resolution (Meta-SR) [2] was proposed in 2019. Its Meta-Upscale Module up-scales the LR image according to an arbitrary scale factor. In contrast, integer-scale SISR methods use a deconvolution layer or a sub-pixel layer at the end of the network as the upscale module. In particular, the sub-pixel layer [3] is widely used in SR works, such as the residual dense network (RDN) [4] and the residual channel attention network (RCAN) [5]. Meta-SR adopts RDN [4] as its backbone and achieves high performance while handling arbitrary scale factors. However, Meta-SR [2] has high complexity, and implementing it poses many challenges in terms of hardware requirements, making it computationally expensive.
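To make the contrast concrete, the following minimal PyTorch sketch shows a sub-pixel upscale head of the kind used at the end of RDN- and RCAN-style networks. The channel count is an illustrative assumption; the key point is that nn.PixelShuffle accepts only integer factors, which is exactly the limitation that arbitrary-scale methods such as Meta-SR remove.

```python
import torch
import torch.nn as nn

scale = 2        # PixelShuffle accepts only integer factors
n_feats = 64     # feature channels (illustrative value)

sub_pixel_head = nn.Sequential(
    # Expand channels by scale**2 so they can be rearranged into space.
    nn.Conv2d(n_feats, n_feats * scale ** 2, kernel_size=3, padding=1),
    nn.PixelShuffle(scale),                  # (C*s^2, H, W) -> (C, s*H, s*W)
    nn.Conv2d(n_feats, 3, kernel_size=3, padding=1),   # project to RGB
)

lr_feats = torch.randn(1, n_feats, 48, 48)
print(sub_pixel_head(lr_feats).shape)        # torch.Size([1, 3, 96, 96])
```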
Therefore, the proposed method focuses on constructing a lightweight model, which is more appropriate for real-life scenarios. The proposed model, called Light Arbitrary-SR (LAS), is inspired by Meta-SR [2] but is much more lightweight than the original. Compared with a similar arbitrary-scale study [6] based on very deep super-resolution (VDSR), our results show better HR image quality with fewer weights and a lower computational cost.

2. Proposed Method

The proposed LAS is inspired by RCAN [5] and Meta-SR [2]. We built an efficient and lightweight backbone based on RCAN [5] and combined it with the Meta-Upscale Module [2], as shown in Figure 2. One novelty of RCAN [5] is its very deep network built on the residual-in-residual (RIR) structure: the network comprises several residual groups with long skip connections, and each group consists of multiple residual blocks with short skip connections.
Generally, RIR lets abundant low-frequency information bypass the main network via numerous skip connections, so the network can concentrate on learning high-frequency information. A channel attention mechanism is also introduced to further improve the representational ability of the network. The dominant component is the residual channel attention block (RCAB), which helps the network efficiently identify informative parts of the LR features. Inspired by the success of channel attention (CA) and residual blocks (RB), the RCAB helps the network learn and exploit more information to improve overall performance. RCAN is constructed from RCABs arranged in the RIR structure, as sketched below.
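The following is a minimal PyTorch sketch of an RCAB; the layer sizes (n_feats = 64, reduction = 16) follow common RCAN settings and are assumptions rather than values stated in this paper.

```python
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze global spatial information, then rescale each channel."""
    def __init__(self, n_feats: int, reduction: int = 16):
        super().__init__()
        self.body = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                      # global pooling
            nn.Conv2d(n_feats, n_feats // reduction, 1),  # channel squeeze
            nn.ReLU(inplace=True),
            nn.Conv2d(n_feats // reduction, n_feats, 1),  # channel excite
            nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.body(x)    # per-channel rescaling of the features

class RCAB(nn.Module):
    """Residual block whose output is modulated by channel attention."""
    def __init__(self, n_feats: int = 64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(n_feats, n_feats, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(n_feats, n_feats, 3, padding=1),
            ChannelAttention(n_feats),
        )

    def forward(self, x):
        return x + self.body(x)    # short skip connection
```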
Although a very deep RCAN yields higher accuracy and superior SR results, it remains too complicated, and its high computational cost makes it challenging to implement. We therefore aimed to build a low-complexity network for SR with arbitrary scale factors. The original RCAN uses 20 RCABs in each of 10 residual groups, with about 16 M weights in total. To reduce the complexity and make the network more suitable for hardware implementation, we cut it down by roughly 90%, to only 3 or 6 RCABs in a single residual group, as sketched below.
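Continuing the sketch above, the trimmed backbone keeps a single residual group of 3 (LAS_A) or 6 (LAS_B/C) RCABs; the RCAB class from the previous sketch can be passed in as make_rcab. Class and argument names here are illustrative, not the authors' code.

```python
import torch.nn as nn

class ResidualGroup(nn.Module):
    """One residual group: a few RCABs plus a long skip connection (RIR)."""
    def __init__(self, make_rcab, n_feats: int = 64, n_rcab: int = 3):
        super().__init__()
        layers = [make_rcab(n_feats) for _ in range(n_rcab)]
        layers.append(nn.Conv2d(n_feats, n_feats, 3, padding=1))
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        return x + self.body(x)    # long skip connection

class LASBackbone(nn.Module):
    """Shallow feature extraction plus a single residual group
    (versus 10 groups of 20 RCABs in the original RCAN)."""
    def __init__(self, make_rcab, n_feats: int = 64, n_rcab: int = 3):
        super().__init__()
        self.head = nn.Conv2d(3, n_feats, 3, padding=1)
        self.group = ResidualGroup(make_rcab, n_feats, n_rcab)

    def forward(self, lr):
        x = self.head(lr)
        return x + self.group(x)   # LR features, fed to the upscale module
```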
Another highlight of the proposed LAS is the Meta-Upscale Module, which has three core functions: Location Projection, Weight Prediction, and Feature Mapping. Location Projection projects each pixel of the HR image onto the LR image according to the scale factor, and the Weight Prediction module predicts the kernel weights for each HR pixel. Finally, Feature Mapping applies the predicted kernel weights to the feature maps of the LR image to compute the value of each HR pixel. We also simplified the Meta-Upscale Module. Its Weight Prediction function predicts the kernel weights with a network of two fully connected layers, which consumes considerable computational resources. We experimented with reducing the number of neurons from 256 to 128 and then to 64 and observed the resulting performance. In the end, the Meta-Upscale Module was simplified by reducing the number of neurons in the fully connected layer from 256 to 64. The proposed method is thus confirmed to be a lightweight SR method with arbitrary scale factors; a sketch of the simplified weight-prediction step is given below.
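The sketch below illustrates the simplified Weight Prediction step: a small two-layer fully connected network predicts, for every HR pixel, the weights of a k x k convolution kernel from the pixel's fractional offset and the scale factor. The hidden width of 64 reflects the reduction from 256 described above; everything else (kernel size, input encoding) follows Meta-SR [2] only loosely and is an assumption.

```python
import torch
import torch.nn as nn

class WeightPrediction(nn.Module):
    """Predicts one k x k conv kernel per HR pixel from its relative
    position and the scale factor (two fully connected layers, with the
    hidden width reduced from 256 to 64)."""

    def __init__(self, n_feats: int = 64, kernel_size: int = 3, hidden: int = 64):
        super().__init__()
        self.k = kernel_size
        self.n_feats = n_feats
        self.mlp = nn.Sequential(
            nn.Linear(3, hidden),    # input: (x offset, y offset, 1/scale)
            nn.ReLU(inplace=True),
            nn.Linear(hidden, 3 * n_feats * kernel_size ** 2),  # RGB kernels
        )

    def forward(self, scale: float, out_h: int, out_w: int) -> torch.Tensor:
        # Location Projection: each HR pixel (i, j) maps to LR pixel
        # (floor(i / scale), floor(j / scale)) plus a fractional offset.
        i = torch.arange(out_h, dtype=torch.float32)
        j = torch.arange(out_w, dtype=torch.float32)
        off_i = (i / scale - (i / scale).floor()).view(-1, 1).expand(out_h, out_w)
        off_j = (j / scale - (j / scale).floor()).view(1, -1).expand(out_h, out_w)
        rel = torch.stack([off_i, off_j, torch.full_like(off_i, 1.0 / scale)], dim=-1)
        # Weight Prediction: one kernel per HR pixel.
        w = self.mlp(rel.view(-1, 3))                 # (H*W, 3*C*k*k)
        return w.view(out_h * out_w, 3, self.n_feats * self.k ** 2)

wp = WeightPrediction()
kernels = wp(scale=2.5, out_h=120, out_w=120)         # torch.Size([14400, 3, 576])
```

Feature Mapping would then gather, for each HR pixel, the k x k LR feature patch at its projected location and take the inner product with the predicted kernel; that step is omitted here for brevity.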

3. Experimental Results

To obtain a lightweight super-resolution model with non-integer scale factors, we combined the Meta-Upscale Module with an RCAN-based backbone and simplified both. The experiments cover three LAS versions, LAS_A, LAS_B, and LAS_C, plus Meta-RCAN, each with a different setting. LAS_A uses three RCABs in a single residual group with a simplified Meta-Upscale Module whose fully connected layers have 64 neurons. LAS_B uses six RCABs in a single residual group with the same simplified Meta-Upscale Module (64 neurons). LAS_C uses six RCABs in one residual group with 256 neurons in the fully connected layers of the Meta-Upscale Module. Finally, Meta-RCAN denotes a slightly simplified RCAN with 16 RCABs and 10 residual groups; its setting was adopted from the official source code of Ref. [2]. We re-trained that model and report its test results, but we do not consider Meta-RCAN one of the LAS versions. The three LAS settings are summarized below.
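For reference, the three variants can be written as configuration entries; the field names are illustrative, and the parameter counts are taken from Table 1.

```python
LAS_VARIANTS = {
    "LAS_A": dict(n_rcab=3, n_groups=1, fc_neurons=64),    # ~411 K params
    "LAS_B": dict(n_rcab=6, n_groups=1, fc_neurons=64),    # ~634 K params
    "LAS_C": dict(n_rcab=6, n_groups=1, fc_neurons=256),   # ~967 K params
}
```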
All the experiments were run in parallel on two GPUs (Nvidia GeForce GTX 1080 Ti) using the PyTorch framework and Python 3 with CUDA 11.2.142. Training and testing required PyTorch 0.5.0, Python 3.5 or higher, NumPy, skimage, imageio, and cv2. The training scale factors for the proposed methods varied from 1 to 4 with a stride of 0.1 (1, 1.1, 1.2, …, 4). The training set contained 800 images from the DIV2K dataset [7], and the test sets were Set5 [8], Set14 [9], and B100 [10]. The learning rate was initialized to 10^-4 for all layers and halved every 200 epochs, and the Adam optimizer was used. For better convergence, the network was trained with the L1 loss function instead of the L2 loss. A sketch of this training schedule is given below.
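The following sketch captures the stated schedule (Adam, L1 loss, learning rate 10^-4 halved every 200 epochs, scale factors 1.0 to 4.0 in steps of 0.1). The model and data loader are placeholders, not the authors' implementation.

```python
import torch
import torch.nn as nn

# Scale factors 1.0, 1.1, ..., 4.0 with a stride of 0.1 (31 values).
scales = [round(1.0 + 0.1 * i, 1) for i in range(31)]

model = nn.Conv2d(3, 3, 3, padding=1)        # stand-in for the LAS network
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=200, gamma=0.5)
criterion = nn.L1Loss()                      # L1 instead of L2, for convergence

# Per-epoch loop (DIV2K loader omitted):
# for epoch in range(num_epochs):
#     for lr_patch, hr_patch, scale in train_loader:
#         sr = model(lr_patch)               # backbone + Meta-Upscale in LAS
#         loss = criterion(sr, hr_patch)
#         optimizer.zero_grad()
#         loss.backward()
#         optimizer.step()
#     scheduler.step()                       # halves the LR every 200 epochs
```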
Since RCAN has better representational ability than RDN, Meta-RCAN achieves similar metric values with about 40% fewer parameters than Meta-RDN (Table 1). Moreover, LAS_C has more neurons in the fully connected layers of its Meta-Upscale Module, so it has approximately 30% more parameters than LAS_B; yet LAS_B's metric values are only slightly lower than LAS_C's. Comparing LAS_A and LAS_B likewise shows that more parameters yield better image quality.
Compared with the lightweight VDSR, LAS_B achieves slightly higher metric values while still requiring relatively few parameters. LAS_A uses about 33% fewer parameters than VDSR and obtains almost the same metric values. The VDSR results are taken from the original data in Ref. [6], while the Meta-RDN [2] results were obtained from a pre-trained model that we created.
Figure 3 presents HR images generated by LAS_B at several scale factors, and Figures 4–6 compare the proposed methods with others on generated X2.0, X3.0, and X4.0 HR images, respectively. Overall, there is a trade-off between performance, evaluated with the PSNR and SSIM metrics, and cost, assessed by the parameter count. With only about 400 K parameters, the proposed LAS_A is reasonable and realistic for implementation on hardware devices, particularly because it supports non-integer scale factors. A sketch of the metric computation is given below.
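For completeness, PSNR and SSIM (the metrics reported in Table 1) can be computed with scikit-image, which is listed among the paper's dependencies. Evaluating on the Y channel and cropping scale-sized borders are common SR conventions and are assumptions here, not details stated by the authors.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(sr: np.ndarray, hr: np.ndarray, scale: float):
    """sr, hr: uint8 Y-channel images of the same size."""
    border = int(np.ceil(scale))            # crop borders affected by padding
    sr_c = sr[border:-border, border:-border]
    hr_c = hr[border:-border, border:-border]
    psnr = peak_signal_noise_ratio(hr_c, sr_c, data_range=255)
    ssim = structural_similarity(hr_c, sr_c, data_range=255)
    return psnr, ssim
```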

4. Conclusions

Super-resolution with non-integer scale factors is a practical topic that has gradually gained attention in recent years. Meta-SR [2] tackles this problem with a novel upscale module that predicts the kernel weights according to the given scale factor. With this design, only a single model needs to be trained for all arbitrary scale factors, saving the time and effort of training a specific model for each one. However, Meta-SR is still computationally expensive. Inspired by Ref. [2], we built a lightweight network suitable for hardware applications. The main contribution of this work is a single low-cost model that covers all arbitrary scale factors; the network is trained from scratch and only needs to be prepared once for all of them.

Author Contributions

Conceptualization, Q.T.D.D. and K.-Y.H.; methodology, Q.T.D.D. and K.-Y.H.; software, Q.T.D.D.; validation, Q.T.D.D.; formal analysis, Q.T.D.D.; investigation, K.-Y.H.; resources, K.-Y.H.; data curation, K.-Y.H.; writing—original draft preparation, Q.T.D.D.; writing—review and editing, K.-Y.H.; visualization, K.-Y.H.; supervision, P.-Y.C.; project administration, P.-Y.C.; funding acquisition, P.-Y.C. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Science and Technology Council, R.O.C., under NSTC-110-2221-E-006-164-MY3, in part by National Academy of Marine Research, Taiwan, under NAMR-111001, and in part by Qualcomm through a Taiwan University Research Collaboration Project.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Acknowledgments

The authors would like to thank anonymous editors for their valuable comments and suggestions to improve the quality of the paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Dong, C.; Loy, C.C.; He, K.; Tang, X. Image super-resolution using deep convolutional networks. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 38, 295–307. [Google Scholar] [CrossRef] [PubMed]
  2. Hu, X.; Mu, H.; Zhang, X.; Wang, Z.; Tan, T.; Sun, J. Meta-SR: A magnification-arbitrary network for super-resolution. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019. [Google Scholar]
  3. Shi, W.; Caballero, J.; Huszár, F.; Totz, J.; Aitken, A.P.; Bishop, R.; Rueckert, D.; Wang, Z. Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016. [Google Scholar]
  4. Zhang, Y.; Tian, Y.; Kong, Y.; Zhong, B.; Fu, Y. Residual dense network for image super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018. [Google Scholar]
  5. Zhang, Y.; Li, K.; Li, K.; Wang, L.; Zhong, B.; Fu, Y. Image super-resolution using very deep residual channel attention networks. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018. [Google Scholar]
  6. Kim, J.; Lee, J.K.; Lee, K.M. Accurate image super-resolution using very deep convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016. [Google Scholar]
  7. Agustsson, E.; Timofte, R. NTIRE 2017 challenge on single image super-resolution: Dataset and study. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Honolulu, HI, USA, 21–26 July 2017; pp. 1122–1131. [Google Scholar]
  8. Bevilacqua, M.; Roumy, A.; Guillemot, C.; Alberi-Morel, M.-L. Low-complexity single-image super-resolution based on nonnegative neighbor embedding. In Proceedings of the British Machine Vision Conference 2012, Surrey, UK, 3–7 September 2012; pp. 1–10. [Google Scholar] [CrossRef]
  9. Zeyde, R.; Elad, M.; Protter, M. On single image scale-up using sparse-representations. In Proceedings of the 7th International Conference on Curves and Surfaces, Avignon, France, 24–30 June 2010; pp. 711–730. [Google Scholar]
  10. Martin, D.; Fowlkes, C.; Tal, D.; Malik, J. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Proceedings of the Eighth IEEE International Conference on Computer Vision, Vancouver, BC, Canada, 7–14 July 2001. [Google Scholar]
Figure 1. (a) Multiple SR models for different scale factors and (b) a single SR model for arbitrary scale factors.
Figure 2. Architecture of the Meta-RCAN network.
Figure 3. Generated HR image of “ppt3” from the Set14 dataset.
Figure 4. Visual comparison of the image “Monarch” from the Set14 dataset with a scale factor of 2.
Figure 5. Visual comparison of the image “zebra” from the Set14 dataset with a scale factor of 3.
Figure 6. Visual comparison of the image “Baboon” from the Set14 dataset with a scale factor of 4.
Table 1. Experimental results of the proposed method and comparison with other methods (PSNR/SSIM).

| Dataset | Scale | Bicubic | VDSR [6] | LAS_A | LAS_B | LAS_C | Meta-RCAN | Meta-RDN [2] |
|---|---|---|---|---|---|---|---|---|
| Params | | N/A | 665 K | 411 K | 634 K | 967 K | 12.7 M | 22 M |
| Set5 | 2 | 33.66/0.9299 | 37.53/0.9587 | 37.52/0.9583 | 37.67/0.9591 | 37.72/0.9593 | 38.22/0.9611 | 38.23/0.9610 |
| | 2.5 | – | – | 35.36/0.9395 | 35.60/0.9411 | 35.59/0.9410 | 36.20/0.9444 | 36.18/0.9441 |
| | 3 | 30.39/0.8682 | 33.66/0.9213 | 33.72/0.9209 | 34.03/0.9238 | 34.02/0.9240 | 34.73/0.9295 | 34.72/0.9296 |
| | 3.5 | – | – | 32.52/0.9019 | 32.81/0.9058 | 32.91/0.9071 | 33.56/0.9142 | 33.60/0.9146 |
| | 4 | 28.42/0.8104 | 31.35/0.8838 | 31.36/0.8807 | 31.72/0.8871 | 31.81/0.8886 | 32.52/0.8989 | 32.51/0.8986 |
| Set14 | 2 | 30.24/0.8688 | 33.03/0.9124 | 33.05/0.9123 | 33.26/0.9149 | 33.27/0.9149 | 34.02/0.9206 | 34.03/0.9204 |
| | 2.5 | – | – | 31.17/0.8704 | 31.37/0.8740 | 31.37/0.8735 | 31.97/0.8819 | 31.89/0.8814 |
| | 3 | 27.55/0.7742 | 29.77/0.8314 | 29.83/0.8319 | 30.06/0.8364 | 30.06/0.8366 | 30.58/0.8463 | 30.58/0.8465 |
| | 3.5 | – | – | 28.86/0.7970 | 29.08/0.8023 | 29.11/0.8035 | 29.59/0.8139 | 29.60/0.8140 |
| | 4 | 26.00/0.7027 | 28.01/0.7674 | 28.05/0.7672 | 28.28/0.7735 | 28.31/0.7749 | 28.84/0.7872 | 28.86/0.7878 |
| B100 | 2 | 29.56/0.8431 | 31.90/0.8960 | 31.81/0.8940 | 31.97/0.8966 | 31.99/0.8971 | 32.33/0.9008 | 32.36/0.9011 |
| | 2.5 | – | – | 29.95/0.8415 | 30.11/0.8450 | 30.12/0.8446 | 30.46/0.8509 | 30.48/0.8509 |
| | 3 | 27.21/0.7385 | 28.82/0.7976 | 28.75/0.7959 | 28.90/0.7999 | 28.92/0.8005 | 29.26/0.8079 | 29.28/0.8089 |
| | 3.5 | – | – | 27.89/0.7566 | 28.04/0.7617 | 28.06/0.7626 | 28.40/0.7718 | 28.42/0.7730 |
| | 4 | 25.96/0.6675 | 27.29/0.7251 | 27.22/0.7233 | 27.37/0.7289 | 27.40/0.7301 | 27.73/0.7409 | 27.76/0.7419 |