RFANSR: Receptive Field Aggregation Network for Lightweight Remote Sensing Image Super-Resolution
Highlights
- RFANSR achieves superior performance at significantly reduced computational cost: PSNR improvements of 0.06 dB on RSCNN7 and 0.14 dB on DOTA over DLKN, while using only 383 K parameters (a 45.4% reduction from DLKN’s 702 K parameters).
- Progressive Receptive Field Aggregator (PRFA) effectively expands receptive field through medium-sized kernels (7 × 7, 9 × 9, 11 × 11) while maintaining asymptotically Gaussian distribution, avoiding parameter redundancy of extremely large kernels.
- The lightweight design (383 K parameters, 79.8 G FLOPs) makes high-quality remote sensing image super-resolution feasible on resource-constrained edge devices and in real-time processing scenarios.
- The Statistical Guidance Module (SGM) provides a new paradigm for channel utilization in lightweight networks, replacing inefficient identity mapping with minimal parameter overhead (5.5% increase) while achieving ~1 dB performance improvement.
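The parameter argument behind the PRFA highlight can be checked with simple arithmetic. The sketch below is our illustration, not the authors' code; it assumes the medium kernels are composed sequentially as stride-1 depthwise convolutions, which is one plausible reading of "progressively combining medium-sized kernels":

```python
# Illustration (not the paper's implementation): stacking the PRFA kernel
# sizes (7, 9, 11) reaches the same receptive field as one extremely large
# kernel, with far fewer per-channel weights.

def stacked_receptive_field(kernels):
    """Receptive field of sequentially stacked stride-1 convolutions: 1 + sum(k - 1)."""
    rf = 1
    for k in kernels:
        rf += k - 1
    return rf

def depthwise_weights_per_channel(kernels):
    """Per-channel weight count for a stack of depthwise convolutions."""
    return sum(k * k for k in kernels)

kernels = [7, 9, 11]
rf = stacked_receptive_field(kernels)              # 1 + 6 + 8 + 10 = 25
stacked = depthwise_weights_per_channel(kernels)   # 49 + 81 + 121 = 251
single = rf * rf                                   # one 25x25 kernel: 625

print(rf, stacked, single)  # 25 251 625
```

Under these assumptions, the stack covers a 25 × 25 receptive field with roughly 60% fewer depthwise weights than a single kernel of that size, which is the redundancy the highlight refers to.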
Abstract
1. Introduction
- We propose the Progressive Receptive Field Aggregator (PRFA), which achieves efficient non-local feature modeling by progressively combining medium-sized convolutional kernels, avoiding the parameter redundancy caused by extremely large kernels.
- We design the Statistical Guidance Module (SGM) for lightweight global information interaction; it exploits otherwise idle residual channels to improve channel utilization.
- Experiments on multiple benchmark datasets show that RFANSR outperforms existing methods while maintaining low computational cost. This makes it suitable for resource-constrained scenarios.
2. Related Work
2.1. Image Super-Resolution
2.2. Lightweight Super-Resolution
2.3. Large Kernel Convolution Methods
3. Proposed Method
3.1. Dual-Stream Receptive Block (DSRB)
3.2. Progressive Receptive Field Aggregator (PRFA)
3.3. Multi-Scale Distillation Block (MSDB)
3.4. Spatial-Gated Feedforward Network (SGFN)
3.5. Loss Function
4. Experiment
4.1. Experimental Results
4.2. Qualitative and Quantitative Experiments
| Scale | Method | Params | FLOPs | RSCNN7 PSNR | RSCNN7 SSIM | DOTA PSNR | DOTA SSIM | WHU-RS19 PSNR | WHU-RS19 SSIM |
|---|---|---|---|---|---|---|---|---|---|
| ×2 | RLFN [35] | 527 K | 116 G | 32.39 | 0.8684 | 36.36 | 0.9361 | 35.48 | 0.9302 |
| ×2 | RFDN [6] | 417 K | 91.3 G | 32.37 | 0.8680 | 36.44 | 0.9259 | 35.67 | 0.9303 |
| ×2 | FENet [21] | 351 K | 77.1 G | 32.35 | 0.8674 | 36.33 | 0.9357 | 35.66 | 0.9301 |
| ×2 | SMFANet [36] | 186 K | 38.8 G | 32.37 | 0.8679 | 36.46 | 0.9364 | 35.69 | 0.9303 |
| ×2 | SPAN [34] | 426 K | 94.3 G | 32.38 | 0.8681 | 36.19 | 0.9360 | 35.60 | 0.9301 |
| ×2 | DLKN [22] | 702 K | 159 G | 32.45 | 0.8699 | 36.45 | 0.9148 | 37.87 | 0.9149 |
| ×2 | Ours | 383 K | 79.8 G | 32.51 | 0.8723 | 36.59 | 0.9363 | 35.80 | 0.9321 |
| ×4 | RLFN [35] | 544 K | 29.8 G | 28.39 | 0.7241 | 29.68 | 0.8276 | 30.03 | 0.7990 |
| ×4 | RFDN [6] | 433 K | 23.7 G | 28.40 | 0.7243 | 29.78 | 0.8291 | 30.06 | 0.7999 |
| ×4 | FENet [21] | 366 K | 20.1 G | 28.41 | 0.7236 | 29.85 | 0.8298 | 30.10 | 0.8001 |
| ×4 | SMFANet [36] | 197 K | 10.4 G | 28.38 | 0.7234 | 29.71 | 0.8265 | 30.04 | 0.7986 |
| ×4 | SPAN [34] | 426 K | 24.4 G | 28.38 | 0.7241 | 29.77 | 0.8285 | 30.01 | 0.7991 |
| ×4 | DLKN [22] | 721 K | 40.9 G | 28.39 | 0.7272 | 29.53 | 0.8124 | 29.72 | 0.7844 |
| ×4 | Ours | 399 K | 21.1 G | 28.43 | 0.7275 | 29.82 | 0.8303 | 30.13 | 0.8009 |
| Scale | Method | Params | Set5 PSNR | Set5 SSIM | Set14 PSNR | Set14 SSIM | BSDS100 PSNR | BSDS100 SSIM | Urban100 PSNR | Urban100 SSIM | Manga109 PSNR | Manga109 SSIM |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ×2 | IMDN [19] | 694 K | 37.89 | 0.9606 | 33.51 | 0.9169 | 32.12 | 0.8990 | 32.00 | 0.9267 | 38.61 | 0.9770 |
| ×2 | RFDN [6] | 417 K | 37.78 | 0.9606 | 33.35 | 0.9166 | 32.09 | 0.8991 | 31.79 | 0.9254 | 38.29 | 0.9764 |
| ×2 | RLFN [35] | 526 K | 37.88 | 0.9606 | 33.44 | 0.9168 | 32.13 | 0.8991 | 31.88 | 0.9259 | 38.39 | 0.9766 |
| ×2 | CFSR [33] | 298 K | 37.86 | 0.9605 | 33.44 | 0.9169 | 32.12 | 0.8992 | 31.77 | 0.9352 | 38.31 | 0.9764 |
| ×2 | SPAN [34] | 481 K | 37.94 | 0.9608 | 33.47 | 0.9165 | 32.14 | 0.8993 | 31.92 | 0.9265 | 38.30 | 0.9765 |
| ×2 | Ours | 383 K | 38.10 | 0.9613 | 33.79 | 0.9202 | 32.25 | 0.9008 | 32.52 | 0.9319 | 39.14 | 0.9780 |
| ×3 | IMDN [19] | 703 K | 34.36 | 0.9272 | 30.28 | 0.8412 | 29.05 | 0.8045 | 28.09 | 0.8504 | 33.48 | 0.9438 |
| ×3 | RFDN [6] | 424 K | 34.18 | 0.9260 | 30.23 | 0.8406 | 29.02 | 0.8037 | 27.90 | 0.8475 | 33.23 | 0.9422 |
| ×3 | RLFN [35] | 533 K | 34.24 | 0.9266 | 30.26 | 0.8412 | 29.04 | 0.8412 | 27.99 | 0.8489 | 33.28 | 0.9426 |
| ×3 | CFSR [33] | 294 K | 34.23 | 0.9262 | 30.25 | 0.8406 | 29.04 | 0.8044 | 27.90 | 0.8475 | 33.30 | 0.9428 |
| ×3 | SPAN [34] | 417 K | 34.28 | 0.9268 | 30.27 | 0.8417 | 29.06 | 0.8049 | 28.04 | 0.8499 | 33.39 | 0.9436 |
| ×3 | Ours | 389 K | 34.37 | 0.9271 | 30.38 | 0.8434 | 29.12 | 0.8064 | 28.27 | 0.8552 | 33.69 | 0.9457 |
| ×4 | IMDN [19] | 715 K | 32.09 | 0.8942 | 28.54 | 0.7810 | 27.52 | 0.7340 | 25.96 | 0.7819 | 30.33 | 0.9063 |
| ×4 | RFDN [6] | 433 K | 32.13 | 0.8943 | 28.50 | 0.7795 | 27.51 | 0.7339 | 25.92 | 0.7803 | 30.20 | 0.9051 |
| ×4 | RLFN [35] | 543 K | 31.97 | 0.8931 | 28.47 | 0.7795 | 27.51 | 0.7342 | 25.88 | 0.7803 | 30.12 | 0.9035 |
| ×4 | CFSR [33] | 303 K | 32.00 | 0.8930 | 28.49 | 0.7797 | 27.52 | 0.7343 | 25.84 | 0.7781 | 30.15 | 0.9045 |
| ×4 | SPAN [34] | 426 K | 32.08 | 0.8942 | 28.53 | 0.7810 | 27.55 | 0.7351 | 25.95 | 0.7812 | 30.34 | 0.9064 |
| ×4 | Ours | 399 K | 32.37 | 0.8974 | 28.75 | 0.7855 | 27.66 | 0.7400 | 26.44 | 0.7966 | 30.89 | 0.9132 |
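The PSNR values reported above follow the standard definition, PSNR = 10·log10(MAX²/MSE). The sketch below is a minimal illustration, not the authors' evaluation pipeline (which may crop borders or evaluate on the Y channel only):

```python
# Minimal PSNR sketch for 8-bit images, here on flat pixel lists.
import math

def psnr(ref, test, max_val=255.0):
    """Peak signal-to-noise ratio (dB) between two equally sized images."""
    mse = sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)

ref = [50, 100, 150, 200]
out = [52, 98, 149, 203]
print(round(psnr(ref, out), 2))  # ~41.6 dB
```

Note that small PSNR gaps such as the 0.06 dB margin over DLKN correspond to small but consistent MSE reductions across the whole test set.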
4.3. Complexity and Inference Efficiency
4.4. Ablation Studies
4.4.1. Effects of SGM
4.4.2. Effects of PRFA
4.4.3. Effects of SGFN
5. Discussion
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Li, W.; Wei, W.; Zhang, L. GSDet: Object Detection in Aerial Images Based on Scale Reasoning. IEEE Trans. Image Process. 2021, 30, 4599–4609.
- Yue, T.; Lu, X.; Cai, J.; Chen, Y.; Chu, S. YOLO-MST: Multiscale deep learning method for infrared small target detection based on super-resolution and YOLO. Opt. Laser Technol. 2025, 187, 112835.
- Li, Y.; Chen, W.; Zhang, Y.; Tao, C.; Xiao, R.; Tan, Y. Accurate Cloud Detection in High-Resolution Remote Sensing Imagery by Weakly Supervised Deep Learning. Remote Sens. Environ. 2020, 250, 112045.
- Xie, Z.; Wang, J.; Song, W.; He, Q.; Zhang, M.; Chang, B. HAMD-RSISR: Hybrid Attention and Multi-Dictionary for Remote Sensing Super-Resolution. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2025, 18, 22556–22572.
- Dong, C.; Loy, C.C.; He, K.; Tang, X. Image super-resolution using deep convolutional networks. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 38, 295–307.
- Liu, J.; Tang, J.; Wu, G. Residual feature distillation network for lightweight image super-resolution. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; Springer International Publishing: Cham, Switzerland, 2020; pp. 41–55.
- Li, Z.; Liu, Y.; Chen, X.; Cai, H.; Gu, J.; Qiao, Y.; Dong, C. Blueprint separable residual network for efficient image super-resolution. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 21–24 June 2022; pp. 833–843.
- Sun, L.; Pan, J.; Tang, J. ShuffleMixer: An efficient ConvNet for image super-resolution. Adv. Neural Inf. Process. Syst. 2022, 35, 17314–17326.
- Hu, Q.; Tang, Y.; Zhang, X. Large Kernel Modulation Network for Efficient Image Super-Resolution. arXiv 2025, arXiv:2508.11893.
- Lee, D.; Yun, S.; Ro, Y. Partial large kernel CNNs for efficient super-resolution. arXiv 2024, arXiv:2404.11848.
- Kim, J.; Lee, J.K.; Lee, K.M. Accurate image super-resolution using very deep convolutional networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 1646–1654.
- Lim, B.; Son, S.; Kim, H.; Nah, S.; Mu Lee, K. Enhanced deep residual networks for single image super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA, 21–26 July 2017.
- Dai, T.; Cai, J.; Zhang, Y.; Xia, S.T.; Zhang, L. Second-order attention network for single image super-resolution. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019.
- Niu, B.; Wen, W.; Ren, W.; Zhang, X.; Yang, L.; Wang, S. Single image super-resolution via a holistic attention network. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; Springer International Publishing: Cham, Switzerland, 2020.
- Liang, J.; Cao, J.; Sun, G.; Zhang, K.; Van Gool, L.; Timofte, R. SwinIR: Image restoration using Swin Transformer. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021.
- Chen, X.; Wang, X.; Zhou, J.; Qiao, Y.; Dong, C. Activating more pixels in image super-resolution transformer. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023; pp. 22367–22377.
- Long, W.; Zhou, X.; Zhang, L.; Gu, S. Progressive Focused Transformer for Single Image Super-Resolution. In Proceedings of the Computer Vision and Pattern Recognition Conference, Nashville, TN, USA, 11–15 June 2025.
- Chen, C.F.R.; Fan, Q.; Panda, R. CrossViT: Cross-attention multi-scale vision transformer for image classification. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 357–366.
- Hui, Z.; Gao, X.; Yang, Y.; Wang, X. Lightweight image super-resolution with information multi-distillation network. In Proceedings of the 27th ACM International Conference on Multimedia, Nice, France, 21–25 October 2019; pp. 2024–2032.
- Wang, Z.; Gao, G.; Li, J.; Yan, H.; Zheng, H.; Lu, H. Lightweight feature de-redundancy and self-calibration network for efficient image super-resolution. ACM Trans. Multimed. Comput. Commun. Appl. 2023, 19, 1–15.
- Wang, Y.; Shao, Z.; Lu, T.; Wu, C.; Wang, J. Remote sensing image super-resolution via multiscale enhancement network. IEEE Geosci. Remote Sens. Lett. 2023, 20, 5000905.
- Liu, Y.; Lan, C.; Feng, W. DLKN: Enhanced lightweight image super-resolution with dynamic large kernel network. Vis. Comput. 2025, 41, 3627–3644.
- Li, A.; Zhang, L.; Liu, Y.; Zhu, C. Feature modulation transformer: Cross-refinement of global representation via high-frequency prior for image super-resolution. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 2–6 October 2023; pp. 12514–12524.
- Chen, J.; Kao, S.; He, H.; Zhuo, W.; Wen, S.; Lee, C.-H.; Chan, S.-H.G. Run, don’t walk: Chasing higher FLOPS for faster neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023; pp. 12021–12031.
- Lee, H.J.; Kim, H.E.; Nam, H. SRM: A style-based recalibration module for convolutional neural networks. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 1854–1862.
- Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 7132–7141.
- Zou, Q.; Ni, L.; Zhang, T.; Wang, Q. Deep Learning Based Feature Selection for Remote Sensing Scene Classification. IEEE Geosci. Remote Sens. Lett. 2015, 12, 2321–2325.
- Xia, G.S.; Bai, X.; Ding, J.; Zhu, Z.; Belongie, S.; Luo, J.; Datcu, M.; Pelillo, M.; Zhang, L. DOTA: A large-scale dataset for object detection in aerial images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 3974–3983.
- Li, J.; Fang, F.; Mei, K.; Zhang, G. Multi-scale residual network for image super-resolution. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 517–532.
- Martin, D.; Fowlkes, C.; Tal, D.; Malik, J. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Proceedings of the Eighth IEEE International Conference on Computer Vision (ICCV 2001), Vancouver, BC, Canada, 7–14 July 2001; pp. 416–423.
- Huang, J.-B.; Singh, A.; Ahuja, N. Single image super-resolution from transformed self-exemplars. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 5197–5206.
- Matsui, Y.; Ito, K.; Aramaki, Y.; Fujimoto, A.; Ogawa, T.; Yamasaki, T.; Aizawa, K. Sketch-based manga retrieval using Manga109 dataset. Multimed. Tools Appl. 2017, 76, 21811–21838.
- Xie, X.; Zhou, P.; Li, H.; Lin, Z.; Yan, S. Adan: Adaptive Nesterov momentum algorithm for faster optimizing deep models. IEEE Trans. Pattern Anal. Mach. Intell. 2024, 46, 9508–9520.
- Wan, C.; Yu, H.; Li, Z.; Chen, Y.; Zou, Y.; Liu, Y.; Yin, X.; Zuo, K. Swift parameter-free attention network for efficient super-resolution. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 17–21 June 2024.
- Kong, F.; Li, M.; Liu, S.; Liu, D.; He, J.; Bai, Y.; Chen, F.; Fu, L. Residual local feature network for efficient super-resolution. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 19–24 June 2022; pp. 766–776.
- Zheng, M.; Sun, L.; Dong, J.; Pan, J. SMFANet: A lightweight self-modulation feature aggregation network for efficient image super-resolution. In Proceedings of the European Conference on Computer Vision, Milan, Italy, 29 September–4 October 2024; Springer Nature: Cham, Switzerland, 2024; pp. 359–375.
- Zhang, J.; Lei, J.; Xie, W.; Fang, Z.; Li, Y.; Du, Q. SuperYOLO: Super resolution assisted object detection in multimodal remote sensing imagery. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5605415.








| Method | Params | FLOPs | Inference Time (ms) | Peak Memory (MB) | DOTA PSNR | Manga109 PSNR |
|---|---|---|---|---|---|---|
| RLFN | 544 K | 29.8 G | 25.0 | 102 | 29.68 | 29.54 |
| DLKN | 721 K | 40.9 G | 44.4 | 186 | 29.53 | 27.88 |
| Ours | 399 K | 21.1 G | 38.8 | 155 | 29.82 | 27.66 |
| Configuration | Params | Set5 | Set14 | BSDS100 | Urban100 | Manga109 |
|---|---|---|---|---|---|---|
| With SGM | 383 K | 31.69 | 28.28 | 27.36 | 25.43 | 29.54 |
| With SEBlock | 366 K | 30.66 | 27.63 | 27.10 | 24.61 | 27.88 |
| Without SGM | 363 K | 30.55 | 27.59 | 26.97 | 24.58 | 27.66 |
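The overhead and gain claimed for the SGM in the highlights can be recovered directly from the SGM ablation numbers. A back-of-envelope check (our arithmetic, not the paper's code):

```python
# SGM cost/benefit from the ablation table: parameter overhead relative to
# the no-SGM baseline, and the Set5 PSNR improvement.
with_sgm, without_sgm = 383_000, 363_000
overhead = (with_sgm - without_sgm) / without_sgm   # ~0.055
gain_set5 = 31.69 - 30.55                           # PSNR delta in dB

print(f"{overhead:.1%}, {gain_set5:.2f} dB")        # 5.5%, 1.14 dB
```

This matches the highlight's figures: a 5.5% parameter increase for roughly 1 dB of PSNR on Set5 (and comparably large gains on Urban100 and Manga109).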
| Kernel Sizes | Params | Set5 | Set14 | BSDS100 | Urban100 | Manga109 |
|---|---|---|---|---|---|---|
| [5, 7, 9] | 376 K | 31.20 | 27.99 | 27.18 | 25.00 | 28.60 |
| [7, 9, 11] | 383 K | 31.69 | 28.28 | 27.36 | 25.43 | 29.54 |
| [9, 11, 13] | 426 K | 30.38 | 27.39 | 26.85 | 24.34 | 27.27 |
| Variant | Params | Set5 | Set14 | BSDS100 | Urban100 | Manga109 |
|---|---|---|---|---|---|---|
| MLP | 395 K | 31.34 | 27.83 | 26.93 | 25.06 | 29.17 |
| SimpleGate | 410 K | 31.48 | 28.08 | 27.09 | 25.18 | 29.28 |
| ConFNN | 391 K | 31.57 | 28.13 | 27.22 | 25.30 | 29.43 |
| SGFN | 383 K | 31.69 | 28.28 | 27.36 | 25.43 | 29.54 |
Share and Cite
Yan, X.; Song, W.; Feng, X.; Guo, W.; Ning, K. RFANSR: Receptive Field Aggregation Network for Lightweight Remote Sensing Image Super-Resolution. Remote Sens. 2025, 17, 4028. https://doi.org/10.3390/rs17244028

