Article

RSTSRN: Recursive Swin Transformer Super-Resolution Network for Mars Images

1 Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China
2 Key Laboratory of Lunar and Deep Space Exploration, Chinese Academy of Sciences, Beijing 100101, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(20), 9286; https://doi.org/10.3390/app14209286
Submission received: 14 August 2024 / Revised: 27 September 2024 / Accepted: 9 October 2024 / Published: 12 October 2024
(This article belongs to the Special Issue Advances in Image Recognition and Processing Technologies)

Abstract

High-resolution optical images provide planetary geology researchers with finer, more detailed image data. In order to maximize scientific output, it is necessary to further increase the resolution of acquired images, so image super-resolution (SR) reconstruction techniques have become the best choice. To address the large parameter counts and high computational complexity of current deep learning-based image SR reconstruction methods, we propose a novel Recursive Swin Transformer Super-Resolution Network (RSTSRN) for image SR. The RSTSRN improves upon the LapSRN, which we use as our backbone architecture. A Residual Swin Transformer Block (RSTB), consisting of stacked Swin Transformer Blocks (STBs) with a residual connection, is used for more efficient residual learning. Moreover, parameter sharing is introduced to reduce the number of parameters, and a multi-scale training strategy is designed to accelerate convergence. Experimental results show that the proposed RSTSRN achieves performance superior to state-of-the-art methods with similar parameter counts on 2×, 4× and 8× SR tasks, and its advantage is especially pronounced on high-magnification SR tasks. Compared to the LapSRN, on the 2×, 4× and 8× Mars image SR tasks the RSTSRN increases PSNR by 0.35 dB, 0.88 dB and 1.22 dB, and SSIM by 0.0048, 0.0114 and 0.0311, respectively.

1. Introduction

Exploration is the driving force for the development of human civilization and social progress. In the course of human exploration, space exploration can most directly expand the territory of human cognition and is extremely challenging. If we can understand the history of Mars, mankind’s view of the entire solar system may change. By studying the morphological characteristics of the Martian surface, we can gain a deep understanding of the formation and evolution process of the Martian surface and the evolutionary history of Martian geology [1,2,3,4]. Therefore, the primary task of Mars exploration is to obtain global high-resolution optical images.
The earliest global images of Mars were obtained in the early 1970s with a resolution of only 1 km. The Viking 1 and Viking 2 orbiters in the late 1970s obtained global images of Mars with a resolution of 100∼200 m. The Mars Global Surveyor (MGS) and Mars Express have obtained global images of Mars with a resolution of several meters. It took humans 30 years to increase the resolution of images of Mars by hundreds of times. The High-Resolution Imaging Science Experiment (HiRISE) [5,6,7,8,9] of the Mars Reconnaissance Orbiter (MRO), launched in 2005, increased the resolution by a further factor of 10, obtaining Mars remote sensing images with a resolution of up to 0.3 m. HiRISE’s excellent resolution has opened many new research topics for planetary geology researchers: how the stratigraphic layers in the Martian interior are formed, where new impact craters appear on the surface of Mars, and how the surface of Mars changes dynamically (for example, how sand dunes move). The high-resolution images provided by HiRISE can help researchers explore Martian plates, volcanoes, climate and seasonal effects as well as understand the mysteries of impact crater formation, magmatic activity and glaciation on Mars.
Various physical processes have left rich and colorful patterns on the surface of Mars, and high-resolution imaging data help researchers explore the mysteries of Martian plates, volcanoes, climate, and seasonal effects as well as the formation of impact craters, magmatic activity, and glaciation. However, such high resolution also comes at a cost. On the one hand, the volume of acquired data is very large, and, limited by the data downlink bandwidth, it takes a long time to send it back to Earth. On the other hand, the field of view is limited, and the images obtained so far cover only a few percent of the Martian surface.
In order to maximize scientific output, it is necessary to further increase the resolution of acquired images, so image super-resolution (SR) reconstruction techniques have become the best choice. At the same time, SR will also provide planetary geology researchers with finer, more detailed image data, opening up new research topics and deepening our understanding of the formation and evolution of the Martian surface and the evolutionary history of Martian geology.
The types of surface features on Mars are monotonous: apart from bare soil and gravel, there is no vegetation or other cover. Compared with natural images, Mars images therefore have more monotonous texture details and equally monotonous color information. This makes image super-resolution reconstruction extremely challenging, because it is difficult for feature learning mechanisms to extract from such monotonous image content the high-frequency information required for reconstruction.
In this work, we propose a novel Recursive Swin Transformer Super-Resolution Network (RSTSRN) for image SR. The RSTSRN improves upon the LapSRN (Laplacian pyramid Super-Resolution Network) [10] in three aspects: (1) it uses Residual Swin Transformer Blocks (RSTBs) for more effective residual learning; (2) it introduces parameter sharing to reduce the number of parameters; and (3) it adopts a multi-scale training strategy to accelerate convergence. The shifted windowing scheme (Shifted Window Multi-head Self-Attention, SW-MSA) brings greater efficiency, and based on the RSTB, the high-frequency features contained in images can be fully learned to better reconstruct super-resolution images. The Transformer is a model based entirely on an attention mechanism, without convolution. In the multi-head self-attention mechanism, the input is linearly projected onto multiple feature subspaces and processed in parallel by multiple independent attention heads; the concatenated head outputs are then mapped to the final output, as sketched below. We train the RSTSRN model and demonstrate 2×, 4×, and 8× resolution enhancement for images.
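As a rough illustration of this multi-head self-attention mechanism, the following PyTorch sketch projects the input into several subspaces, attends in parallel, and concatenates the heads before a final linear mapping. The module and its parameters (embed_dim, num_heads) are illustrative assumptions, not the exact layers used in the RSTSRN.

```python
import torch
import torch.nn as nn

class MultiHeadSelfAttention(nn.Module):
    """Minimal multi-head self-attention sketch (illustrative, not the paper's exact module)."""
    def __init__(self, embed_dim: int, num_heads: int):
        super().__init__()
        assert embed_dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = embed_dim // num_heads
        self.qkv = nn.Linear(embed_dim, 3 * embed_dim)   # linear projections to Q, K, V
        self.proj = nn.Linear(embed_dim, embed_dim)      # maps the concatenated heads to the output

    def forward(self, x):                                # x: (batch, tokens, embed_dim)
        b, n, c = x.shape
        qkv = self.qkv(x).reshape(b, n, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)             # each: (batch, heads, tokens, head_dim)
        attn = (q @ k.transpose(-2, -1)) / self.head_dim ** 0.5
        attn = attn.softmax(dim=-1)                      # attention weights per head
        out = (attn @ v).transpose(1, 2).reshape(b, n, c)  # concatenate the heads
        return self.proj(out)
```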

2. Related Works

SR algorithms have been applied in different fields, such as remote sensing [11,12,13,14,15], medical image processing [16,17,18], facial recognition [19,20,21], etc. Based on the number of input LR images in the reconstruction process, SR algorithms can be roughly divided into two categories: Multiple-Image SR (MISR) algorithms [22,23,24] and Single-Image SR (SISR) algorithms (also known as learning-based methods) [25,26,27].

2.1. Multiple-Image SR Algorithms

MISR algorithms can fully utilize the basic and non-redundant information contained in each frame of sequential LR images.
The Projection Onto Convex Sets (POCS) algorithm [28,29,30] can flexibly integrate various spatial observation models, motion models, and degradation models. It makes strong use of prior information and preserves details and edges well. However, the solution of this type of algorithm is usually not unique, the reconstruction result depends strongly on the initial value, the projection process requires a large amount of computation, convergence is slow, and convergence stability needs further improvement. The Iterative Back-Projection (IBP) algorithm [31,32] is intuitive in theory and easy to implement; however, because the SR reconstruction problem is ill posed, the solution of the IBP algorithm is not unique. The Maximum A Posteriori (MAP) algorithm [22,33,34,35,36] can fully exploit spatial-domain prior information to ensure the stability of the final solution and has strong noise robustness, but it has high computational complexity and easily blurs edge details. The hybrid MAP/POCS algorithm [37] considers the convex-set constraints and the statistical features of the LR image sequence at the same time and thus inherits the advantages of both algorithms; the quality of the reconstructed image is better than with the POCS or MAP algorithm alone, but convergence can only be guaranteed by using gradient descent optimization methods.
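For intuition only, the iterative back-projection idea can be sketched in a few lines of PyTorch. Here the degradation model is simplified to bicubic down-sampling and the step size beta is an assumed hyperparameter, so this is a conceptual sketch rather than the exact formulation of [31,32].

```python
import torch
import torch.nn.functional as F

def iterative_back_projection(lr, scale=2, iters=10, beta=1.0):
    """Illustrative IBP: refine an HR estimate so its simulated LR matches the observed LR (N, C, H, W)."""
    hr = F.interpolate(lr, scale_factor=scale, mode="bicubic", align_corners=False)
    for _ in range(iters):
        simulated_lr = F.interpolate(hr, scale_factor=1 / scale, mode="bicubic", align_corners=False)
        error = lr - simulated_lr                        # residual in the LR domain
        hr = hr + beta * F.interpolate(error, scale_factor=scale, mode="bicubic", align_corners=False)
    return hr
```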
Fundamentally overcoming the limitations of MISR algorithms, however, still requires new ideas and methods.

2.2. Single-Image SR Algorithms

The basic idea of the SISR algorithm is to obtain the mapping relationship between HR images and LR images through learning, which is used to guide the reconstruction of new HR images. SISR algorithms are further divided into shallow learning-based SR algorithms [11,25,38] and deep learning-based SR algorithms [39,40,41,42].

2.2.1. Shallow Learning-Based SR Algorithms

Classic shallow learning-based algorithms include example-based algorithms [11,25] and sparse-representation-based algorithms [38].
In the reconstruction stage, the example-based algorithm uses the low-frequency information of the LR block to be reconstructed as an index to search for similar samples in a sample library, and it then uses the high-frequency information of the matched samples to perform SR reconstruction. This type of algorithm needs to establish a huge sample library, and every LR block must be searched against it, resulting in very high computational complexity. Moreover, mismatches between searched sample blocks generate artifacts at the edges of the reconstructed image blocks, which seriously affect reconstruction quality. To reduce artifacts, reconstruction is performed with overlapping blocks, but averaging the high-frequency information of the overlapping regions leads to over-smoothed edges in the reconstructed image.
The sparse-representation-based algorithms perform sparse representation over a sample library composed of paired LR and HR image blocks. The sparse coefficients obtained by encoding corresponding LR and HR blocks over their respective dictionaries are similar, which establishes the connection between LR and HR. This type of algorithm achieves better reconstruction quality, but the sparse coding and reconstruction process requires many iterations, resulting in high algorithmic complexity.
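A minimal sketch of this coupled-dictionary idea is shown below, assuming pre-learned LR/HR dictionaries and using scikit-learn's sparse_encode for the sparse coding step; the atom counts and patch sizes are illustrative, not taken from any specific method.

```python
import numpy as np
from sklearn.decomposition import sparse_encode

# Hypothetical coupled dictionaries learned offline from paired LR/HR patches:
# each row is one atom; columns are flattened patch pixels (assumed shapes).
D_lr = np.random.randn(256, 25)    # 256 atoms for 5x5 LR patches
D_hr = np.random.randn(256, 100)   # matching atoms for 10x10 HR patches

lr_patch = np.random.randn(1, 25)  # one flattened LR patch to reconstruct

# Sparse code of the LR patch over the LR dictionary ...
alpha = sparse_encode(lr_patch, D_lr, algorithm="omp", n_nonzero_coefs=5)
# ... reused with the HR dictionary to synthesize the HR patch.
hr_patch = alpha @ D_hr            # shape (1, 100), i.e. a 10x10 HR patch
```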
Shallow learning-based algorithms are mainly divided into three stages: feature extraction, learning, and reconstruction. Each stage is independently designed and optimized, and the feature extraction and expressive capabilities of the learning model are limited.

2.2.2. Deep Learning-Based SR Algorithms

The emergence of deep learning technology has made up for the shortcomings of traditional shallow learning algorithms. At present, deep learning-based SR algorithms have become the mainstream algorithms in the industry.
Dong et al. applied convolutional neural networks to image SR reconstruction for the first time (SR Convolutional Neural Network, SRCNN) [42]. They divided the network into three stages, namely image block extraction, nonlinear mapping, and image reconstruction, and unified these stages into a single CNN (Convolutional Neural Network) framework to realize end-to-end learning from LR images to HR images.
Classic deep learning SR networks include FSRCNN (Fast SRCNN) [43], ESPCN (Efficient Sub-Pixel Convolutional neural Network) [44], VDSR (Very Deep networks for SR) [45], DRCN (Deeply Recursive Convolutional Network) [46], SCN (Sparse-Coding-based Network) [47], DEGREE (Deep Edge Guided REcurrent rEsidual) [48], MemNet (Memory Network) [49], LapSRN [10], DRRN (Deep Recursive Residual Network) [50], SRDenseNet (Super-Resolution Dense convolutional Network) [51], SRGAN (Super-Resolution Generative Adversarial Network) [52], RDN (Residual Dense Network) [53], SwinIR [54], NGswin (N-Gram Swin Transformer) [55], DAT (Dual Aggregation Transformer) [56], etc.
In recent years, several deep learning-based SR algorithms for Lunar and Mars images have been published and have achieved good results [13,27,39,40,57,58,59,60,61], which provides a useful reference for this work. The authors' team has also explored remote sensing image SR [11,62,63], covering network structure and lightweight design, loss functions, and unsupervised learning. We previously proposed a Lightweight Laplacian Pyramid Recursive and Residual Network (LRN) for Mars images in [64].

3. Materials and Methods

3.1. RSTSRN Architecture

Inspired by the parameter-sharing and recursion mechanisms proposed in DRCN [46], we propose a novel Recursive Swin Transformer Super-Resolution Network (RSTSRN) that combines the Swin Transformer [65] with the Laplacian pyramid introduced in the LapSRN. Figure 1 shows the proposed RSTSRN network architecture; our goal is to estimate 2×/4×/8× SR images from an input LR image.
The Residual Image Extraction Block (RIEB) starts from a 3 × 3 convolutional layer, which can be formulated as
$$x_1^{sf} = f_{IFE}(I_{LR}), \tag{1}$$
where $I_{LR}$ denotes the LR input to the network, $f_{IFE}$ denotes the feature extraction function applied to $I_{LR}$, and $x_1^{sf}$ denotes the feature map extracted from the LR input.
Then, $x_1^{sf}$ is passed into the subsequent RSTBs and the other layers of the RIEB, as shown in
$$I_{2\times}^{R} = f_{RI}(x_1^{sf}), \tag{2}$$
$$I_{2\times}^{R} = f_{RIEB}(I_{LR}) = f_{RI}(f_{IFE}(I_{LR})), \tag{3}$$
where $I_{2\times}^{R}$ denotes the residual image extracted from the LR input by the first RIEB.
As shown in Figure 1, each RIEB contains 6 RSTBs. RSTB is based on the original RSTB and STB structures that were used in SwinIR [54] and Swin Transformer [65]; we have added the global skip connection that was used within the RIEB, and non-homologous skip connections were used for each RSTB to focus on residual learning. This design reduces the difficulty of learning residual images and improves the convergence efficiency of the network.
After the nonlinear feature mapping, we use the Pixelshuffle layer [44] for up-sampling. The input size of the Pixelshuffle layer is consistent with the size of the LR for each RIEB. The combination of a pair of convolutional layers and a LReLU layer before the Pixelshuffle layer is used to improve the extraction ability of deep features. The convolutional layer after the Pixelshuffle layer is used to improve the reconstruction quality of residual images. The 2×, 4×, and 8× SR image reconstruction processes can be expressed as
$$I_{2\times}^{SR} = I_{2\times}^{R} + f_{up}(I_{LR}), \tag{4}$$
$$I_{2\times}^{SR} = I_{2\times}^{R} + I_{2\times}^{LR}, \tag{5}$$
$$I_{4\times}^{SR} = I_{4\times}^{R} + f_{up}(I_{2\times}^{SR}), \tag{6}$$
$$I_{4\times}^{SR} = I_{4\times}^{R} + I_{4\times}^{LR}, \tag{7}$$
$$I_{8\times}^{SR} = I_{8\times}^{R} + f_{up}(I_{4\times}^{SR}), \tag{8}$$
$$I_{8\times}^{SR} = I_{8\times}^{R} + I_{8\times}^{LR}. \tag{9}$$
The up-sampling function $f_{up}$, from $I_{LR}$ to $I_{2\times}^{LR}$, from $I_{2\times}^{SR}$ to $I_{4\times}^{LR}$, and from $I_{4\times}^{SR}$ to $I_{8\times}^{LR}$, is bicubic interpolation with a scale factor of 2.
In the RSTSRN, the key to generating high-quality SR images is the inference of the residual images, which should contain more high-frequency information. By replacing the up-sampling steps from $I_{2\times}^{LR}$ to $I_{4\times}^{LR}$ and from $I_{4\times}^{LR}$ to $I_{8\times}^{LR}$ with up-sampling from $I_{2\times}^{SR}$ to $I_{4\times}^{LR}$ and from $I_{4\times}^{SR}$ to $I_{8\times}^{LR}$, respectively, the high-frequency information of $I_{2\times}^{SR}$ and $I_{4\times}^{SR}$ is better utilized in residual learning.
In the RSTSRN, parameter sharing is applied at two levels: between RIEBs and between RSTBs. As shown in Figure 1, the RSTSRN consists of a three-level pyramid structure in which each level performs a 2× SR task with identical functionality. Therefore, parameters can be shared between levels, with the convolutional layers, RSTBs, and Pixelshuffle layer of each level using the same parameters. The RSTBs are the core structure of the RSTSRN network and are responsible for inferring residual images from shallow features for SR image reconstruction. In the RSTSRN, each RSTB shares the same parameters; however, the STBs within an RSTB do not share parameters. A simplified sketch of this shared-parameter pyramid is given below.
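The following PyTorch sketch shows such a three-level pyramid with inter-level parameter sharing; the residual branch is collapsed into a plain convolutional stand-in for the stacked RSTBs, and the channel counts and module names are assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RSTSRNSketch(nn.Module):
    """Illustrative 3-level pyramid: the same residual branch and up-sampler are reused at every level."""
    def __init__(self, channels=64):
        super().__init__()
        self.feat_extract = nn.Conv2d(3, channels, 3, padding=1)           # stand-in for f_IFE
        self.residual_branch = nn.Sequential(                              # stand-in for the stacked RSTBs
            nn.Conv2d(channels, channels, 3, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(channels, 4 * channels, 3, padding=1),
            nn.PixelShuffle(2),                                            # 2x spatial up-sampling
            nn.Conv2d(channels, 3, 3, padding=1),                          # residual image
        )

    def forward(self, lr):
        outputs, x = [], lr
        for _ in range(3):                                                 # 2x, 4x, 8x levels share parameters
            residual = self.residual_branch(self.feat_extract(x))
            upsampled = F.interpolate(x, scale_factor=2, mode="bicubic", align_corners=False)
            x = upsampled + residual                                       # SR image = bicubic + residual
            outputs.append(x)
        return outputs                                                     # [2x SR, 4x SR, 8x SR]
```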

3.2. Loss Functions and Multi-Scale Training Strategy

The RSTSRN uses the $L_1$ loss function for parameter optimization, which can be formulated as
$$\mathcal{L}_{S}^{1} = \frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{L}\left| I_{HR_{j}}^{i} - I_{SR_{j}}^{i} \right|, \tag{10}$$
$$\mathcal{L}_{MS} = \mathcal{L}_{2}^{1} + \mathcal{L}_{4}^{1} + \mathcal{L}_{8}^{1}, \tag{11}$$
where $\mathcal{L}_{MS}$ represents the multi-scale loss function, $S$ denotes the scale of the RSTSRN network, $N$ denotes the number of images per batch, $L$ denotes the number of pyramid levels, and $I_{HR_{j}}^{i}$ and $I_{SR_{j}}^{i}$ represent the HR image and the SR image, respectively.
We designed a special multi-scale training strategy for the progressive up-sampling framework of the pyramid structure used in the RSTSRN. An 8× RSTSRN network produces 2×, 4×, and 8× outputs in one forward propagation, while a 4× RSTSRN network produces 2× and 4× outputs; each result is a superposition of similar 2× SR reconstruction steps. Multi-scale training refers to simultaneously training three RSTSRN networks with SR scales of 8×, 4×, and 2×. Each forward propagation then produces three 2× outputs, two 4× outputs, and one 8× output. While the final outputs of the three networks (i.e., the 8× output of the 8× network, the 4× output of the 4× network, and the 2× output of the 2× network) have the same spatial resolution, the three 2× SR results have different spatial resolutions, and the same applies to the two 4× SR results. Conducting these tasks simultaneously during training allows them to promote each other and accelerates convergence. The specific approach is to add up the three networks' loss functions before backpropagation to form a final multi-scale loss function, as shown in Equation (11), and then perform backpropagation and parameter optimization for the three RSTSRN networks. The multi-scale training process is shown in Figure 2.
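A rough sketch of how the per-scale L1 losses could be summed into the multi-scale loss of Equation (11) before a single backward pass is shown below; the network outputs and HR pyramids are placeholders, and only the summation pattern is the point.

```python
import torch.nn.functional as F

def multi_scale_loss(outputs_2x, outputs_4x, outputs_8x, hr_pyramids):
    """Sum the L1 losses of the 2x, 4x and 8x networks before one backward pass (illustrative)."""
    loss = 0.0
    for outputs, hr_levels in zip((outputs_2x, outputs_4x, outputs_8x), hr_pyramids):
        # each network produces one SR image per pyramid level it contains
        for sr, hr in zip(outputs, hr_levels):
            loss = loss + F.l1_loss(sr, hr)
    return loss   # backpropagated once to update all three networks
```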

3.3. Shifted Window

In this article, the self-attention mechanism is not only limited to non-overlapping local windows but is also applied across windows by means of shifted windows, which improves the utilization of window-edge information and allows the high-frequency features contained in images to be learned more fully. An illustration of the shifted window approach for computing self-attention is shown in Figure 3 [65]. In layer l, self-attention is computed within each window. In the next layer l+1, the window partitioning is shifted, producing new windows whose self-attention computation crosses the boundaries of the previous windows in layer l.
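A minimal sketch of this shift between consecutive layers, using the cyclic shift employed by the Swin Transformer, is given below; the window size of 4 and shift of 2 are assumed toy values.

```python
import torch

def window_partition(x, window_size):
    """Split a (B, H, W, C) feature map into non-overlapping windows (illustrative)."""
    b, h, w, c = x.shape
    x = x.view(b, h // window_size, window_size, w // window_size, window_size, c)
    return x.permute(0, 1, 3, 2, 4, 5).reshape(-1, window_size, window_size, c)

x = torch.randn(1, 8, 8, 32)                                   # toy feature map
windows_layer_l = window_partition(x, window_size=4)           # regular windows in layer l
shifted = torch.roll(x, shifts=(-2, -2), dims=(1, 2))          # cyclic shift for layer l+1
windows_layer_l1 = window_partition(shifted, window_size=4)    # windows now cross the old boundaries
```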

3.4. Training and Testing

Our experimental platform is a computer with an Intel i9-10850K CPU @ 3.60 GHz and an NVIDIA RTX 3090 GPU with 24 GB of memory. The operating system is Windows 10, the deep learning framework is PyTorch 1.8, and the CUDA version is 12.0.
We first train our network on the DIVerse 2K resolution image dataset (DIV2K) [66] to observe our algorithm’s performance on natural images. We carry out extensive experiments using 5 datasets: Set5 [67], Set14 [68], BSDS100 [69], Urban100 [70] and MANGA109 [71].
We also train our network on 5000 pairs of HiRISE and down-sampled HiRISE cropped samples. The training HR samples (512 × 512) were extracted from ESP_066115_2055 (https://www.uahirise.org/ESP_066115_2055, accessed on 18 March 2022); examples of HiRISE images are shown in Figure 4. The 1500 testing images were extracted from ESP_066194_2100 (https://www.uahirise.org/ESP_066194_2100, accessed on 18 March 2022) and ESP_066828_2050 (https://www.uahirise.org/ESP_066828_2050, accessed on 18 March 2022).
In the training phase, the batch size is set to 20 and the Adam optimizer is adopted with β1 = 0.9, β2 = 0.999, and ε = 10^(−8). MultiStep is adopted as the learning rate decay strategy: the learning rate is initialized to 2 × 10^(−4), the multiplication factor is set to 0.5, and the milestones are set to 100,000, 160,000, 180,000, 190,000 and 200,000 iterations.
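Under these stated settings, the optimizer and learning-rate schedule could be configured in PyTorch roughly as follows; the model here is only a placeholder.

```python
import torch

model = torch.nn.Conv2d(3, 3, 3, padding=1)   # placeholder for the RSTSRN model
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4,
                             betas=(0.9, 0.999), eps=1e-8)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer,
    milestones=[100_000, 160_000, 180_000, 190_000, 200_000],  # iteration counts
    gamma=0.5)                                                  # multiplication factor
```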

4. Results

4.1. RSTSRN Results for Natural Images

We compare the proposed RSTSRN with 18 SR algorithms: VDSR [45], DRCN [46], MemNet [49], LapSRN [10], EDSR (Enhanced Deep Super-Resolution) [72], CARN (CAscading Residual Network) [73], D-DBPNs (Dense Deep Back-Projection Networks) [74], MSRN (Multi-Scale Residual Network) [75], IMDN (Information Multi-Distillation Network) [76], AWSRN (Adaptive Weighted Super-Resolution Network) [77], RFDN (Residual Feature Distillation Network) [78], LatticeNet [79], HNCT (Hybrid Network of CNN and Transformer) [80], ELAN (Efficient Long-range Attention Network) [81], ESRT (Efficient Super-Resolution Transformer) [82], SPIN (Super Token Interaction Network) [83], DiVANet (Directional Variance Attention Network) [84] and NGswin [55]. We evaluate the SR images with Peak Signal-to-Noise Ratio (PSNR) and Structural SIMilarity (SSIM) [85]. Table 1 shows quantitative comparisons for 2×, 4× and 8× SR.
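For reference, PSNR can be computed directly from the mean squared error between the SR and HR images, as in the minimal sketch below (images assumed scaled to [0, 1]; this is a generic definition, and SSIM is typically computed with an image-quality library rather than by hand).

```python
import torch

def psnr(sr, hr, max_val=1.0):
    """Peak Signal-to-Noise Ratio in dB for images scaled to [0, max_val]."""
    mse = torch.mean((sr - hr) ** 2)
    return 10 * torch.log10(max_val ** 2 / mse)
```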
As shown in Table 1, the proposed RSTSRN achieves performance superior or close to that of state-of-the-art methods with similar parameter counts (optimal or near-optimal values) on the 2×, 4× and 8× SR tasks. The RSTSRN obtains the highest PSNR and SSIM values on all five test datasets for the 4× and 8× SR tasks, with a large lead over the other algorithms, indicating a clear performance advantage among current lightweight methods. This verifies the effectiveness of the RSTB for high-frequency feature learning on high-magnification super-resolution reconstruction tasks.
For SR tasks at different scales, most algorithms modify the local details of the network model accordingly, and they mainly focus on 2×, 3× and 4× SR; relatively few algorithms address the 8× SR task. Table 2 shows a quantitative comparison of the number of parameters of lightweight SR algorithms on multi-scale tasks. Unlike other lightweight SR reconstruction networks, the RSTSRN uses a progressive up-sampling framework based on the Laplacian image pyramid and an inter-stage parameter sharing mechanism. Therefore, the same parameter set is used for the 2×, 4× and 8× SR tasks, which gives the RSTSRN higher lightweight performance in multi-scale reconstruction tasks.
Visual results of selected algorithms on the benchmark test datasets are presented below. LapSRN, AWSRN, and D-DBPN, which can reconstruct at all three scales (2×, 4×, and 8×), are included. For the 2× and 4× comparisons, CARN, whose parameter count is close to that of the RSTSRN, was selected. For the 4× comparison, EDSR, which has the second-best performance and the largest number of parameters, was selected. For the 4× and 8× comparisons, D-DBPN, which has the second-largest number of parameters, was selected; D-DBPN achieved the second-best results in 8× reconstruction and also achieved good results in 4× reconstruction.
Figure 5 shows locally magnified views of the image “barbara” from the Set14 dataset at 2× SR. Apart from the RSTSRN and AWSRN, the other algorithms fail to fully reconstruct the black-and-white stripes of the scarf in the image. However, the reconstruction result of the AWSRN at the upper end of the enlarged area does not match the original image, so only the RSTSRN completely reconstructs this region.
Figure 6 shows the visual effect of two local magnifications of an image named “butterfly” in the Set5 dataset at 4× SR. For the first area on the upper right, the notch shape of the white patch in the original image can only be reflected to a certain extent in the SR reconstruction results of the CARN and RSTSRN, while the results of the RSTSRN are closer to the geometric shape of the original image compared to the CARN. For the second region, the shape of the notch in the original image can be reconstructed by most algorithms, but only the reconstruction effect of the RSTSRN can reflect the sharp angle of the notch in the original image. In summary, the RSTSRN demonstrated the best visual performance in the 4× SR reconstruction task of a “butterfly”.
Figure 7 shows the local magnification visual effects of an image named “img_040” in the Urban100 dataset at 8× SR. It is evident that only the RSTSRN has completed the correct reconstruction of this area.

4.2. RSTSRN Results for Mars Image

The natural image reconstruction effect proves the superiority of the RSTSRN. In this subsection, we only compare the proposed RSTSRN with the LapSRN [10]. Table 3 shows quantitative comparisons for 2×, 4× and 8× SR.
Compared to the LapSRN network, for the 2×, 4× and 8× SR tasks, the RSTSRN network proposed in this article increases PSNR by 0.35 dB, 0.88 dB and 1.22 dB, and SSIM by 0.0048, 0.0114 and 0.0311, respectively; the objective evaluation indicators are thus significantly improved. Figure 8, Figure 9 and Figure 10 compare the subjective visual effects and objective indicators of several Mars images reconstructed with the Bicubic, LapSRN, and RSTSRN algorithms at 2×, 4× and 8× SR. It can be seen that the RSTSRN proposed in this article visually outperforms its prototype, the LapSRN: the edge and texture details in the SR images reconstructed by the RSTSRN are clearer and sharper, and the overall details are closer to the original HR images. The comparative experiment fully verifies the effectiveness of the RSTB for high-frequency feature learning.
The RSTSRN can not only be used for supporting image analysis to obtain improved scientific understanding of the Martian surface, but it can also be used for supporting existing and future missions. SR images can be employed for a wide range of applications such as the selection of landing sites for future Mars landing missions, the planning of Mars rover exploration paths, and the manned Mars program.

5. Conclusions

In this paper, we introduced RSTSRN, a lightweight and efficient image SR reconstruction network based on the Swin Transformer module that uses the Laplacian image pyramid as its framework. The efficiency of the proposed network is mainly due to the following: (1) We use the Laplacian image pyramid as the network framework, which breaks the hard 8× SR task into three simple 2× SR tasks so that 8× SR can be performed with a lightweight network. (2) We adopt a parameter-sharing mechanism between the pyramid levels and between the recursive blocks to keep the parameter count low. (3) We replace the convolution layers with more efficient Swin Transformer modules and adopt skip connections, enabling the network to achieve higher performance while maintaining low computational complexity.
Despite the positive outcomes of this study, there are some limitations. For example, the parameter count is still relatively large compared to the latest methods such as SPIN, DiVANet, and NGswin, and further lightweight design is needed. To address these limitations, future research can proceed in two directions: one is to further optimize the network structure to reduce the number of parameters; the other is to train and test on more diverse datasets, such as images obtained by the ExoMars Trace Gas Orbiter (TGO)'s Color and Stereo Surface Imaging System (CaSSIS) and the High-Resolution Imaging Camera (HiRIC) on China's first Mars exploration mission, Tianwen-1, to enhance the model's generalization ability. Through these approaches, we hope to elevate super-resolution reconstruction technology for Mars images to a higher level.

Author Contributions

Conceptualization, F.W. and X.J.; methodology, F.W.; software, F.W. and T.F.; validation, F.W., T.F. and Y.F.; formal analysis, F.W., T.F. and D.X.; investigation, F.W., Y.F. and C.Z.; writing—original draft preparation, F.W. and D.X.; writing—review and editing, F.W. and C.Z.; visualization, F.W.; project administration, F.W.; funding acquisition, F.W. and X.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Natural Science Foundation of Jilin Province, grant number 20220101168JC, as well as funded by the National Natural Science Foundation of China, grant number 42001345, and funded by the Key Laboratory of Lunar and Deep Space Exploration, Chinese Academy of Sciences, grant number LDSE201901.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AWSRN   Adaptive Weighted Super-Resolution Network
CARN   CAscading Residual Network
CNN   Convolutional Neural Network
DAT   Dual Aggregation Transformer
DEGREE   Deep Edge Guided REcurrent rEsidual
DIV2K   DIVerse 2K resolution image dataset
DiVANet   Directional Variance Attention Network
DRCN   Deeply Recursive Convolutional Network
DRRN   Deep Recursive Residual Network
D-DBPN   Dense Deep Back-Projection Networks
EDSR   Enhanced Deep Super-Resolution
ELAN   Efficient Long-range Attention Network
ESPCN   Efficient Sub-Pixel Convolutional neural Network
ESRT   Efficient Super-Resolution Transformer
FSRCNN   Fast Super-Resolution Convolutional Neural Network
HiRISE   High-Resolution Imaging Science Experiment
HNCT   Hybrid Network of CNN and Transformer
HR   High Resolution
IBP   Iterative Back-Projection
IMDN   Information Multi-Distillation Network
LapSRN   Laplacian pyramid Super-Resolution Network
LR   Low-Resolution
LRN   Laplacian Pyramid Recursive and Residual Network
MAP   Maximum A Posteriori
MemNet   Memory Network
MGS   Mars Global Surveyor
MISR   Multiple-Image Super-Resolution
MRO   Mars Reconnaissance Orbiter
MSRN   Multi-Scale Residual Network
NGswin   N-Gram Swin Transformer
POCS   Projection Onto Convex Sets
PSNR   Peak Signal-to-Noise Ratio
RDN   Residual Dense Network
RED-Net   very deep Residual Encoder–Decoder Networks
RFDN   Residual Feature Distillation Network
RIEB   Residual Image Extraction Block
RSTB   Residual Swin Transformer Block
RSTSRN   Recursive Swin Transformer Super-Resolution Network
SCN   Sparse-Coding-based Network
ScSR   Sparse-coding-based SR
SISR   Single-Image Super-Resolution
SPIN   Super Token Interaction Network
SR   Super-Resolution
SRCNN   Super-Resolution Convolutional Neural Network
SRDenseNet   Super-Resolution Dense convolutional Network
SRGAN   Super-Resolution Generative Adversarial Network
SSIM   Structural SIMilarity
STB   Swin Transformer Block
VDSR   Very Deep networks for Super-Resolution

References

  1. Bell, J.F.; Squyres, S.W.; Arvidson, R.E.; Arneson, H.M.; Bass, D.; Blaney, D.; Cabrol, N.; Calvin, W.; Farmer, J.; Farrand, W.H.; et al. Pancam Multispectral Imaging Results from the Spirit Rover at Gusev Crater. Science 2004, 305, 800–806. [Google Scholar] [CrossRef]
  2. Bell, J.F.; Squyres, S.W.; Arvidson, R.E.; Arneson, H.M.; Bass, D.; Calvin, W.; Farrand, W.H.; Goetz, W.; Golombek, M.; Greeley, R.; et al. Pancam Multispectral Imaging Results from the Opportunity Rover at Meridiani Planum. Science 2004, 306, 1703–1709. [Google Scholar] [CrossRef]
  3. Blake, D.F.; Morris, R.V.; Kocurek, G.; Morrison, S.M.; Downs, R.T.; Bish, D.; Ming, D.W.; Edgett, K.S.; Rubin, D.; Goetz, W.; et al. Curiosity at Gale Crater, Mars: Characterization and Analysis of the Rocknest Sand Shadow. Science 2013, 341, 1239505. [Google Scholar] [CrossRef]
  4. Grotzinger, J.P.; Sumner, D.Y.; Kah, L.C.; Stack, K.; Gupta, S.; Edgar, L.; Rubin, D.; Lewis, K.; Schieber, J.; Mangold, N.; et al. A Habitable Fluvio-Lacustrine Environment at Yellowknife Bay, Gale Crater, Mars. Science 2014, 343, 1242777. [Google Scholar] [CrossRef]
  5. McEwen, A.S.; Eliason, E.M.; Bergstrom, J.W.; Bridges, N.T.; Hansen, C.J.; Delamere, W.A.; Grant, J.A.; Gulick, V.C.; Herkenhoff, K.E.; Keszthelyi, L.; et al. Mars Reconnaissance Orbiter’s High Resolution Imaging Science Experiment (HiRISE). J. Geophys. Res. Planets 2007, 112, E05S02. [Google Scholar] [CrossRef]
  6. Kirk, R.L.; Howington-Kraus, E.; Rosiek, M.R.; Anderson, J.A.; Archinal, B.A.; Becker, K.J.; Cook, D.A.; Galuszka, D.M.; Geissler, P.E.; Hare, T.M.; et al. Ultrahigh resolution topographic mapping of Mars with MRO HiRISE stereo images: Meter-scale slopes of candidate Phoenix landing sites. J. Geophys. Res. Planets 2008, 113, E00A24. [Google Scholar] [CrossRef]
  7. Keszthelyi, L.; Jaeger, W.; McEwen, A.; Tornabene, L.; Beyer, R.A.; Dundas, C.; Milazzo, M. High Resolution Imaging Science Experiment (HiRISE) images of volcanic terrains from the first 6 months of the Mars Reconnaissance Orbiter Primary Science Phase. J. Geophys. Res. Planets 2008, 113, E04005. [Google Scholar] [CrossRef]
  8. Lefort, A.; Russell, P.S.; Thomas, N.; McEwen, A.S.; Dundas, C.M.; Kirk, R.L. Observations of periglacial landforms in Utopia Planitia with the High Resolution Imaging Science Experiment (HiRISE). J. Geophys. Res. Planets 2009, 114, E04005. [Google Scholar] [CrossRef]
  9. Dundas, C.M.; McEwen, A.S.; Diniega, S.; Byrne, S.; Martinez-Alonso, S. New and recent gully activity on Mars as seen by HiRISE. Geophys. Res. Lett. 2010, 37, L07202. [Google Scholar] [CrossRef]
  10. Lai, W.S.; Huang, J.B.; Ahuja, N.; Yang, M.H. Deep Laplacian Pyramid Networks for Fast and Accurate Super-Resolution. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 5835–5843. [Google Scholar] [CrossRef]
  11. Wu, F.L.; Wang, X.J. Example-based super-resolution for single-image analysis from the Chang’e-1 Mission. Res. Astron. Astrophys. 2016, 16, 172. [Google Scholar] [CrossRef]
  12. Zhang, H.; Yang, Z.; Zhang, L.; Shen, H. Super-Resolution Reconstruction for Multi-Angle Remote Sensing Images Considering Resolution Differences. Remote Sens. 2014, 6, 637–657. [Google Scholar] [CrossRef]
  13. Tao, Y.; Muller, J.P. Super-Resolution Restoration of MISR Images Using the UCL MAGiGAN System. Remote Sens. 2019, 11, 52. [Google Scholar] [CrossRef]
  14. Ma, J.; Zhang, L.; Zhang, J. SD-GAN: Saliency-Discriminated GAN for Remote Sensing Image Superresolution. IEEE Geosci. Remote Sens. Lett. 2020, 17, 1973–1977. [Google Scholar] [CrossRef]
  15. Yu, Y.; Li, X.; Liu, F. E-DBPN: Enhanced Deep Back-Projection Networks for Remote Sensing Scene Image Superresolution. IEEE Trans. Geosci. Remote Sens. 2020, 58, 5503–5515. [Google Scholar] [CrossRef]
  16. Wang, Y.H.; Qiao, J.; Li, J.B.; Fu, P.; Chu, S.C.; Roddick, J.F. Sparse representation-based MRI super-resolution reconstruction. Measurement 2014, 47, 946–953. [Google Scholar] [CrossRef]
  17. Lyu, Q.; Shan, H.; Steber, C.; Helis, C.; Whitlow, C.; Chan, M.; Wang, G. Multi-Contrast Super-Resolution MRI through a Progressive Network. IEEE Trans. Med Imaging 2020, 39, 2738–2749. [Google Scholar] [CrossRef]
  18. Wu, Z.; Chen, X.; Xie, S.; Shen, J.; Zeng, Y. Super-resolution of brain MRI images based on denoising diffusion probabilistic model. Biomed. Signal Process. Control 2023, 85, 104901. [Google Scholar] [CrossRef]
  19. Gunturk, B.; Batur, A.; Altunbasak, Y.; Hayes, M.; Mersereau, R. Eigenface-domain super-resolution for face recognition. IEEE Trans. Image Process. 2003, 12, 597–606. [Google Scholar] [CrossRef]
  20. Grm, K.; Scheirer, W.J.; Struc, V. Face Hallucination Using Cascaded Super-Resolution and Identity Priors. IEEE Trans. Image Process. 2020, 29, 2150–2165. [Google Scholar] [CrossRef]
  21. Hou, H.; Xu, J.; Hou, Y.; Hu, X.; Wei, B.; Shen, D. Semi-Cycled Generative Adversarial Networks for Real-World Face Super-Resolution. IEEE Trans. Image Process. 2023, 32, 1184–1199. [Google Scholar] [CrossRef]
  22. Hardie, R.; Barnard, K.; Armstrong, E. Joint MAP registration and high-resolution image estimation using a sequence of undersampled images. IEEE Trans. Image Process. 1997, 6, 1621–1633. [Google Scholar] [CrossRef]
  23. Farsiu, S.; Robinson, M.; Elad, M.; Milanfar, P. Fast and Robust Multiframe Super Resolution. IEEE Trans. Image Process. 2004, 13, 1327–1344. [Google Scholar] [CrossRef]
  24. Yuan, Q.; Zhang, L.; Shen, H. Multiframe Super-Resolution Employing a Spatially Weighted Total Variation Model. IEEE Trans. Circuits Syst. Video Technol. 2012, 22, 379–392. [Google Scholar] [CrossRef]
  25. Freeman, W.; Jones, T.; Pasztor, E. Example-based super-resolution. IEEE Comput. Graph. Appl. 2002, 22, 56–65. [Google Scholar] [CrossRef]
  26. Wang, Z.; Chen, J.; Hoi, S.C.H. Deep Learning for Image Super-Resolution: A Survey. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 43, 3365–3387. [Google Scholar] [CrossRef]
  27. Tao, Y.; Conway, S.J.; Muller, J.P.; Putri, A.R.D.; Thomas, N.; Cremonese, G. Single Image Super-Resolution Restoration of TGO CaSSIS Colour Images: Demonstration with Perseverance Rover Landing Site and Mars Science Targets. Remote Sens. 2021, 13, 1777. [Google Scholar] [CrossRef]
  28. Stark, H.; Oskoui, P. High-resolution image recovery from image-plane arrays, using convex projections. J. Opt. Soc. Am. A 1989, 6, 1715–1726. [Google Scholar] [CrossRef]
  29. Patti, A.; Sezan, M.; Tekalp, A.M. Superresolution video reconstruction with arbitrary sampling lattices and nonzero aperture time. IEEE Trans. Image Process. 1997, 6, 1064–1076. [Google Scholar] [CrossRef]
  30. Patti, A.; Altunbasak, Y. Artifact reduction for set theoretic super resolution image reconstruction with edge adaptive constraints and higher-order interpolants. IEEE Trans. Image Process. 2001, 10, 179–186. [Google Scholar] [CrossRef]
  31. Irani, M.; Peleg, S. Improving resolution by image registration. CVGIP Graph. Model. Image Process. 1991, 53, 231–239. [Google Scholar] [CrossRef]
  32. Irani, M.; Peleg, S. Motion Analysis for Image Enhancement: Resolution, Occlusion, and Transparency. J. Vis. Commun. Image Represent. 1993, 4, 324–335. [Google Scholar] [CrossRef]
  33. Schultz, R.; Stevenson, R. A Bayesian approach to image expansion for improved definition. IEEE Trans. Image Process. 1994, 3, 233–242. [Google Scholar] [CrossRef] [PubMed]
  34. Schultz, R.; Stevenson, R. Extraction of high-resolution frames from video sequences. IEEE Trans. Image Process. 1996, 5, 996–1011. [Google Scholar] [CrossRef] [PubMed]
  35. Shen, H.; Zhang, L.; Huang, B.; Li, P. A MAP Approach for Joint Motion Estimation, Segmentation, and Super Resolution. IEEE Trans. Image Process. 2007, 16, 479–490. [Google Scholar] [CrossRef]
  36. Belekos, S.P.; Galatsanos, N.P.; Katsaggelos, A.K. Maximum a Posteriori Video Super-Resolution Using a New Multichannel Image Prior. IEEE Trans. Image Process. 2010, 19, 1451–1464. [Google Scholar] [CrossRef]
  37. Elad, M.; Feuer, A. Restoration of a single superresolution image from several blurred, noisy, and undersampled measured images. IEEE Trans. Image Process. 1997, 6, 1646–1658. [Google Scholar] [CrossRef]
  38. Yang, J.; Wright, J.; Huang, T.S.; Ma, Y. Image Super-Resolution Via Sparse Representation. IEEE Trans. Image Process. 2010, 19, 2861–2873. [Google Scholar] [CrossRef]
  39. Tao, Y.; Xiong, S.; Song, R.; Muller, J.P. Towards Streamlined Single-Image Super-Resolution: Demonstration with 10 m Sentinel-2 Colour and 10–60 m Multi-Spectral VNIR and SWIR Bands. Remote Sens. 2021, 13, 2614. [Google Scholar] [CrossRef]
  40. Tao, Y.; Xiong, S.; Muller, J.P.; Michael, G.; Conway, S.J.; Paar, G.; Cremonese, G.; Thomas, N. Subpixel-Scale Topography Retrieval of Mars Using Single-Image DTM Estimation and Super-Resolution Restoration. Remote Sens. 2022, 14, 257. [Google Scholar] [CrossRef]
  41. Zou, H.; He, S.; Cao, X.; Sun, L.; Wei, J.; Liu, S.; Liu, J. Rescaling-Assisted Super-Resolution for Medium-Low Resolution Remote Sensing Ship Detection. Remote Sens. 2022, 14, 2566. [Google Scholar] [CrossRef]
  42. Dong, C.; Loy, C.C.; He, K.; Tang, X. Image Super-Resolution Using Deep Convolutional Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 38, 295–307. [Google Scholar] [CrossRef]
  43. Dong, C.; Loy, C.C.; Tang, X. Accelerating the Super-Resolution Convolutional Neural Network. In Computer Vision—ECCV 2016; Springer International Publishing: Cham, Switzerland, 2016; pp. 391–407. [Google Scholar] [CrossRef]
  44. Shi, W.; Caballero, J.; Huszar, F.; Totz, J.; Aitken, A.P.; Bishop, R.; Rueckert, D.; Wang, Z. Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 1874–1883. [Google Scholar] [CrossRef]
  45. Kim, J.; Lee, J.K.; Lee, K.M. Accurate Image Super-Resolution Using Very Deep Convolutional Networks. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 1646–1654. [Google Scholar] [CrossRef]
  46. Kim, J.; Lee, J.K.; Lee, K.M. Deeply-Recursive Convolutional Network for Image Super-Resolution. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 1637–1645. [Google Scholar] [CrossRef]
  47. Wang, Z.; Liu, D.; Yang, J.; Han, W.; Huang, T. Deep Networks for Image Super-Resolution with Sparse Prior. In Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015; pp. 370–378. [Google Scholar] [CrossRef]
  48. Yang, W.; Feng, J.; Yang, J.; Zhao, F.; Liu, J.; Guo, Z.; Yan, S. Deep Edge Guided Recurrent Residual Learning for Image Super-Resolution. IEEE Trans. Image Process. 2017, 26, 5895–5907. [Google Scholar] [CrossRef]
  49. Tai, Y.; Yang, J.; Liu, X.; Xu, C. MemNet: A Persistent Memory Network for Image Restoration. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 4549–4557. [Google Scholar] [CrossRef]
  50. Tai, Y.; Yang, J.; Liu, X. Image Super-Resolution via Deep Recursive Residual Network. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 2790–2798. [Google Scholar] [CrossRef]
  51. Tong, T.; Li, G.; Liu, X.; Gao, Q. Image Super-Resolution Using Dense Skip Connections. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 4809–4817. [Google Scholar] [CrossRef]
  52. Ledig, C.; Theis, L.; Huszar, F.; Caballero, J.; Cunningham, A.; Acosta, A.; Aitken, A.; Tejani, A.; Totz, J.; Wang, Z.; et al. Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 105–114. [Google Scholar] [CrossRef]
  53. Zhang, Y.; Tian, Y.; Kong, Y.; Zhong, B.; Fu, Y. Residual Dense Network for Image Super-Resolution. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 2472–2481. [Google Scholar] [CrossRef]
  54. Liang, J.; Cao, J.; Sun, G.; Zhang, K.; Gool, L.V.; Timofte, R. SwinIR: Image Restoration Using Swin Transformer. arXiv 2021, arXiv:2108.10257. [Google Scholar] [CrossRef]
  55. Choi, H.; Lee, J.; Yang, J. N-Gram in Swin Transformers for Efficient Lightweight Image Super-Resolution. In Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; pp. 2071–2081. [Google Scholar] [CrossRef]
  56. Chen, Z.; Zhang, Y.; Gu, J.; Kong, L.; Yang, X.; Yu, F. Dual Aggregation Transformer for Image Super-Resolution. In Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 1–6 October 2023; pp. 12278–12287. [Google Scholar] [CrossRef]
  57. Tao, Y.; Douté, S.; Muller, J.P.; Conway, S.J.; Thomas, N.; Cremonese, G. Ultra-High-Resolution 1 m/pixel CaSSIS DTM Using Super-Resolution Restoration and Shape-from-Shading: Demonstration over Oxia Planum on Mars. Remote Sens. 2021, 13, 2185. [Google Scholar] [CrossRef]
  58. Tao, Y.; Muller, J.P. Super-Resolution Restoration of Spaceborne Ultra-High-Resolution Images Using the UCL OpTiGAN System. Remote Sens. 2021, 13, 2269. [Google Scholar] [CrossRef]
  59. Delgado-Centeno, J.I.; Sanchez-Cuevas, P.J.; Martinez, C.; Olivares-Mendez, M. Enhancing Lunar Reconnaissance Orbiter Images via Multi-Frame Super Resolution for Future Robotic Space Missions. IEEE Robot. Autom. Lett. 2021, 6, 7721–7727. [Google Scholar] [CrossRef]
  60. Wang, C.; Zhang, Y.; Zhang, Y.; Tian, R.; Ding, M. Mars Image Super-Resolution Based on Generative Adversarial Network. IEEE Access 2021, 9, 108889–108898. [Google Scholar] [CrossRef]
  61. Tewari, A.; Prateek, C.; Khanna, N. In-Orbit Lunar Satellite Image Super Resolution for Selective Data Transmission. arXiv 2021, arXiv:2110.10109. [Google Scholar] [CrossRef]
  62. Zhang, N.; Wang, Y.; Zhang, X.; Xu, D.; Wang, X. An Unsupervised Remote Sensing Single-Image Super-Resolution Method Based on Generative Adversarial Network. IEEE Access 2020, 8, 29027–29039. [Google Scholar] [CrossRef]
  63. Zhang, N.; Wang, Y.; Zhang, X.; Xu, D.; Wang, X.; Ben, G.; Zhao, Z.; Li, Z. A Multi-Degradation Aided Method for Unsupervised Remote Sensing Image Super Resolution With Convolution Neural Networks. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5600814. [Google Scholar] [CrossRef]
  64. Geng, M.; Wu, F.; Wang, D. Lightweight Mars remote sensing image super-resolution reconstruction network. Opt. Precis. Eng. 2022, 30, 1487–1498. [Google Scholar] [CrossRef]
  65. Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows. arXiv 2021, arXiv:2103.14030. [Google Scholar] [CrossRef]
  66. Agustsson, E.; Timofte, R. NTIRE 2017 Challenge on Single Image Super-Resolution: Dataset and Study. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Honolulu, HI, USA, 21–26 July 2017; pp. 1122–1131. [Google Scholar] [CrossRef]
  67. Bevilacqua, M.; Roumy, A.; Guillemot, C.; Alberi Morel, M.-L. Low-Complexity Single-Image Super-Resolution based on Nonnegative Neighbor Embedding. In Proceedings of the British Machine Vision Conference 2012, Surrey, UK, 3–7 September 2012; pp. 135.1–135.10. [Google Scholar] [CrossRef]
  68. Zeyde, R.; Elad, M.; Protter, M. On Single Image Scale-Up Using Sparse-Representations. In Curves and Surfaces; Springer: Berlin/Heidelberg, Germany, 2012; pp. 711–730. [Google Scholar] [CrossRef]
  69. Arbeláez, P.; Maire, M.; Fowlkes, C.; Malik, J. Contour Detection and Hierarchical Image Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2011, 33, 898–916. [Google Scholar] [CrossRef]
  70. Huang, J.B.; Singh, A.; Ahuja, N. Single image super-resolution from transformed self-exemplars. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 5197–5206. [Google Scholar] [CrossRef]
  71. Matsui, Y.; Ito, K.; Aramaki, Y.; Fujimoto, A.; Ogawa, T.; Yamasaki, T.; Aizawa, K. Sketch-based manga retrieval using manga109 dataset. Multimed. Tools Appl. 2017, 76, 21811–21838. [Google Scholar] [CrossRef]
  72. Lim, B.; Son, S.; Kim, H.; Nah, S.; Lee, K.M. Enhanced Deep Residual Networks for Single Image Super-Resolution. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Honolulu, HI, USA, 21–26 July 2017; pp. 1132–1140. [Google Scholar] [CrossRef]
  73. Ahn, N.; Kang, B.; Sohn, K.A. Fast, Accurate, and Lightweight Super-Resolution with Cascading Residual Network. In Computer Vision—ECCV 2018; Springer International Publishing: Cham, Switzerland, 2018; pp. 256–272. [Google Scholar] [CrossRef]
  74. Haris, M.; Shakhnarovich, G.; Ukita, N. Deep Back-Projection Networks for Super-Resolution. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 1664–1673. [Google Scholar] [CrossRef]
  75. Li, J.; Fang, F.; Mei, K.; Zhang, G. Multi-scale Residual Network for Image Super-Resolution. In Computer Vision—ECCV 2018; Springer International Publishing: Cham, Switzerland, 2018; pp. 527–542. [Google Scholar] [CrossRef]
  76. Hui, Z.; Gao, X.; Yang, Y.; Wang, X. Lightweight Image Super-Resolution with Information Multi-distillation Network. In Proceedings of the 27th ACM International Conference on Multimedia, Nice, France, 21–25 October 2019; pp. 2024–2032. [Google Scholar] [CrossRef]
  77. Wang, C.; Li, Z.; Shi, J. Lightweight Image Super-Resolution with Adaptive Weighted Learning Network. arXiv 2019, arXiv:1904.02358. [Google Scholar] [CrossRef]
  78. Liu, J.; Tang, J.; Wu, G. Residual Feature Distillation Network for Lightweight Image Super-Resolution. In Computer Vision—ECCV 2020 Workshops; Springer International Publishing: Cham, Switzerland, 2020; pp. 41–55. [Google Scholar] [CrossRef]
  79. Luo, X.; Xie, Y.; Zhang, Y.; Qu, Y.; Li, C.; Fu, Y. LatticeNet: Towards Lightweight Image Super-Resolution with Lattice Block. In Computer Vision—ECCV 2020; Springer International Publishing: Cham, Switzerland, 2020; pp. 272–289. [Google Scholar] [CrossRef]
  80. Fang, J.; Lin, H.; Chen, X.; Zeng, K. A Hybrid Network of CNN and Transformer for Lightweight Image Super-Resolution. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), New Orleans, LA, USA, 19–20 June 2022; pp. 1102–1111. [Google Scholar] [CrossRef]
  81. Zhang, X.; Zeng, H.; Guo, S.; Zhang, L. Efficient Long-Range Attention Network for Image Super-Resolution. In Computer Vision—ECCV 2022; Springer Nature: Cham, Switzerland, 2022; pp. 649–667. [Google Scholar] [CrossRef]
  82. Lu, Z.; Li, J.; Liu, H.; Huang, C.; Zhang, L.; Zeng, T. Transformer for Single Image Super-Resolution. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), New Orleans, LA, USA, 19–20 June 2022; pp. 456–465. [Google Scholar] [CrossRef]
  83. Zhang, A.; Ren, W.; Liu, Y.; Cao, X. Lightweight Image Super-Resolution with Superpixel Token Interaction. In Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 1–6 October 2023; pp. 12682–12691. [Google Scholar] [CrossRef]
  84. Behjati, P.; Rodriguez, P.; Fernández, C.; Hupont, I.; Mehri, A.; Gonzàlez, J. Single image super-resolution based on directional variance attention network. Pattern Recognit. 2023, 133, 108997. [Google Scholar] [CrossRef]
  85. Wang, Z.; Bovik, A.; Sheikh, H.; Simoncelli, E. Image Quality Assessment: From Error Visibility to Structural Similarity. IEEE Trans. Image Process. 2004, 13, 600–612. [Google Scholar] [CrossRef]
Figure 1. RSTSRN network architecture.
Figure 2. Multi-scale training process.
Figure 3. Shifted window.
Figure 4. Examples of HiRISE images.
Figure 5. Visual comparison for 2× SR on the barbara.
Figure 6. Visual comparison for 4× SR on the butterfly.
Figure 7. Visual comparison for 8× SR on the img_040.
Figure 8. The 2× SR results of the 54th Mars image.
Figure 9. The 4× SR results of the 59th Mars image.
Figure 10. The 8× SR results of the 26th Mars image.
Table 1. Average PSNR (dB)/SSIM for 2×, 4× and 8× SR. 'Year' indicates the publication year of each paper. Red indicates the best performance and blue indicates the second best.

Algorithm | Scale | Year | Parameters | Set5 PSNR/SSIM | Set14 PSNR/SSIM | BSDS100 PSNR/SSIM | Urban100 PSNR/SSIM | MANGA109 PSNR/SSIM
Bicubic | 2× | - | - | 33.66/0.9284 | 30.34/0.8675 | 29.57/0.8434 | 26.88/0.8438 | 30.82/0.9332
VDSR [45] | 2× | 2016 | 665K | 37.53/0.9587 | 33.03/0.9124 | 31.90/0.8960 | 30.76/0.9140 | 37.22/0.9729
DRCN [46] | 2× | 2016 | 1774K | 37.63/0.9588 | 33.04/0.9118 | 31.85/0.8942 | 30.75/0.9133 | 37.63/0.9723
MemNet [49] | 2× | 2017 | 677K | 37.78/0.9597 | 33.28/0.9142 | 32.08/0.8978 | 31.31/0.9195 | 37.72/0.9740
LapSRN [10] | 2× | 2017 | 813K | 37.52/0.9581 | 33.08/0.9109 | 31.80/0.8949 | 30.41/0.9112 | 37.27/0.9745
EDSR [72] | 2× | 2017 | 43000K | 38.11/0.9601 | 33.92/0.9195 | 32.32/0.9013 | 32.93/0.9351 | 39.10/0.9773
CARN [73] | 2× | 2018 | 1592K | 37.76/0.9590 | 33.52/0.9166 | 32.09/0.8978 | 31.51/0.9312 | 38.36/0.9764
D-DBPN [74] | 2× | 2018 | 5819K | 38.09/0.9600 | 33.85/0.9190 | 32.27/0.9000 | 32.55/0.9324 | 38.89/0.9775
MSRN [75] | 2× | 2018 | 5930K | 38.08/0.9605 | 33.74/0.9170 | 32.23/0.9013 | 32.22/0.9326 | 38.69/0.9772
IMDN [76] | 2× | 2019 | 694K | 38.00/0.9605 | 33.63/0.9177 | 32.19/0.8996 | 32.17/0.9283 | 38.88/0.9774
AWSRN [77] | 2× | 2019 | 1397K | 38.11/0.9608 | 33.78/0.9189 | 32.26/0.9006 | 32.49/0.9316 | 38.87/0.9776
RFDN-L [78] | 2× | 2020 | 626K | 38.08/0.9606 | 33.67/0.9190 | 32.18/0.8996 | 32.24/0.9290 | 38.95/0.9773
LatticeNet [79] | 2× | 2020 | 756K | 38.15/0.9610 | 33.78/0.9193 | 32.25/0.9005 | 32.43/0.9302 | -
HNCT [80] | 2× | 2022 | 356K | 38.08/0.9608 | 33.65/0.9182 | 32.22/0.9001 | 32.22/0.9294 | 38.87/0.9774
ELAN-light [81] | 2× | 2022 | 582K | 38.17/0.9611 | 33.94/0.9207 | 32.30/0.9012 | 32.76/0.9340 | 39.11/0.9782
ESRT [82] | 2× | 2022 | 677K | 38.03/0.9600 | 33.75/0.9184 | 32.25/0.9001 | 32.58/0.9318 | 39.12/0.9783
SPIN [83] | 2× | 2023 | 497K | 38.20/0.9615 | 33.90/0.9215 | 32.31/0.9015 | 32.79/0.9340 | 39.18/0.9784
DiVANet [84] | 2× | 2023 | 902K | 38.16/0.9612 | 33.80/0.9195 | 32.29/0.9012 | 32.60/0.9325 | 39.08/0.9775
NGswin [55] | 2× | 2023 | 998K | 38.05/0.9610 | 33.79/0.9199 | 32.27/0.9008 | 32.53/0.9324 | 38.97/0.9777
RSTSRN (ours) | 2× | - | 1830K | 38.11/0.9619 | 33.92/0.9209 | 32.30/0.9012 | 32.68/0.9340 | 39.04/0.9779
Bicubic | 4× | - | - | 28.43/0.8022 | 26.10/0.6936 | 25.97/0.6517 | 23.14/0.6599 | 24.91/0.7826
VDSR [45] | 4× | 2016 | 665K | 31.35/0.8838 | 28.01/0.7674 | 27.29/0.7251 | 25.18/0.7524 | 28.83/0.8809
DRCN [46] | 4× | 2016 | 1774K | 31.53/0.8840 | 28.04/0.7700 | 27.24/0.7240 | 25.14/0.7520 | 28.97/0.8860
MemNet [49] | 4× | 2017 | 677K | 31.74/0.8893 | 28.26/0.7723 | 27.40/0.7281 | 25.50/0.7630 | 29.42/0.8942
LapSRN [10] | 4× | 2017 | 813K | 31.54/0.8856 | 28.19/0.7720 | 27.32/0.7280 | 25.21/0.7560 | 29.09/0.8900
EDSR [72] | 4× | 2017 | 43000K | 32.46/0.8968 | 28.80/0.7876 | 27.71/0.7420 | 26.64/0.8033 | 31.02/0.9148
CARN [73] | 4× | 2018 | 1592K | 32.13/0.8937 | 28.60/0.7806 | 27.58/0.7349 | 26.07/0.7837 | 30.36/0.9082
D-DBPN [74] | 4× | 2018 | 10291K | 32.47/0.8970 | 28.82/0.7860 | 27.72/0.7400 | 26.38/0.7946 | 30.91/0.9137
MSRN [75] | 4× | 2018 | 6078K | 32.07/0.8903 | 28.60/0.7751 | 27.52/0.7273 | 26.04/0.7896 | 30.17/0.9034
IMDN [76] | 4× | 2019 | 715K | 32.21/0.8948 | 28.58/0.7811 | 27.56/0.7353 | 26.04/0.7838 | 30.45/0.9075
AWSRN [77] | 4× | 2019 | 1587K | 32.27/0.8960 | 28.69/0.7843 | 27.64/0.7385 | 26.29/0.7930 | 30.72/0.9109
RFDN-L [78] | 4× | 2020 | 643K | 32.28/0.8957 | 28.61/0.7818 | 27.58/0.7363 | 26.20/0.7883 | 30.61/0.9096
LatticeNet [79] | 4× | 2020 | 777K | 32.30/0.8962 | 28.68/0.7830 | 27.62/0.7367 | 26.25/0.7873 | -
HNCT [80] | 4× | 2022 | 372K | 32.31/0.8957 | 28.71/0.7834 | 27.63/0.7381 | 26.20/0.7896 | 30.70/0.9112
ELAN-light [81] | 4× | 2022 | 601K | 32.43/0.8975 | 28.78/0.7858 | 27.69/0.7406 | 26.54/0.7982 | 30.92/0.9150
ESRT [82] | 4× | 2022 | 751K | 32.19/0.8947 | 28.69/0.7833 | 27.69/0.7379 | 26.39/0.7962 | 30.75/0.9100
SPIN [83] | 4× | 2023 | 555K | 32.48/0.8983 | 28.80/0.7862 | 27.70/0.7415 | 26.55/0.7998 | 30.98/0.9156
DiVANet [84] | 4× | 2023 | 939K | 32.41/0.8973 | 28.70/0.7844 | 27.65/0.7391 | 26.42/0.7958 | 30.73/0.9119
NGswin [55] | 4× | 2023 | 1019K | 32.33/0.8963 | 28.78/0.7859 | 27.66/0.7396 | 26.45/0.7963 | 30.80/0.9128
RSTSRN (ours) | 4× | - | 1830K | 32.63/0.9008 | 28.89/0.7895 | 27.77/0.7435 | 26.78/0.8075 | 31.40/0.9192
Bicubic | 8× | - | - | 24.40/0.6045 | 23.19/0.5110 | 23.67/0.4808 | 20.74/0.4841 | 21.46/0.6138
VDSR [45] | 8× | 2016 | 665K | 25.73/0.6743 | 23.20/0.5110 | 24.34/0.5169 | 21.48/0.5289 | 22.73/0.6688
DRCN [46] | 8× | 2016 | 1774K | 25.93/0.6743 | 24.25/0.5510 | 24.49/0.5168 | 21.71/0.5289 | 23.20/0.6686
MemNet [49] | 8× | 2017 | - | - | - | - | - | -
LapSRN [10] | 8× | 2017 | 813K | 26.15/0.7028 | 24.45/0.5792 | 24.54/0.5293 | 21.81/0.5555 | 23.39/0.7068
EDSR [72] | 8× | 2017 | 43000K | 26.96/0.7762 | 24.91/0.6420 | 24.81/0.5985 | 22.51/0.6221 | 24.69/0.7841
CARN [73] | 8× | 2018 | - | - | - | - | - | -
D-DBPN [74] | 8× | 2018 | 23071K | 27.21/0.7840 | 25.13/0.6480 | 24.88/0.6010 | 22.73/0.6312 | 25.14/0.7987
MSRN [75] | 8× | 2018 | 6226K | 26.59/0.7254 | 24.88/0.5961 | 24.70/0.5410 | 22.37/0.5977 | 24.28/0.7517
IMDN [76] | 8× | 2019 | - | - | - | - | - | -
AWSRN [77] | 8× | 2019 | 2348K | 26.97/0.7747 | 24.99/0.6414 | 24.80/0.5967 | 22.45/0.6174 | 24.60/0.7782
RFDN-L [78] | 8× | 2020 | - | - | - | - | - | -
LatticeNet [79] | 8× | 2020 | - | - | - | - | - | -
HNCT [80] | 8× | 2022 | - | - | - | - | - | -
ELAN-light [81] | 8× | 2022 | - | - | - | - | - | -
ESRT [82] | 8× | 2022 | - | - | - | - | - | -
SPIN [83] | 8× | 2023 | - | - | - | - | - | -
DiVANet [84] | 8× | 2023 | - | - | - | - | - | -
NGswin [55] | 8× | 2023 | - | - | - | - | - | -
RSTSRN (ours) | 8× | - | 1830K | 27.30/0.7895 | 25.24/0.6498 | 24.96/0.6045 | 23.02/0.6458 | 25.23/0.8022
Table 2. Quantitative comparison with the number of parameters of lightweight SR algorithms on multi-scale tasks. 'Year' indicates the publication year of each paper.

Method | Year | 2× | 4× | 8× | Multi-Scale Total
VDSR [45] | 2016 | 665 K | 665 K | 665 K | 1995 K
DRCN [46] | 2016 | 1774 K | 1774 K | 1774 K | 5322 K
MemNet [49] | 2017 | 677 K | 677 K | - | 1354 K
LapSRN [10] | 2017 | 813 K | 813 K | 813 K | 2439 K
CARN [73] | 2018 | 1592 K | 1592 K | - | 3184 K
AWSRN [77] | 2019 | 1397 K | 1587 K | 2348 K | 5332 K
RSTSRN (ours) | - | 1830 K | 1830 K | 1830 K | 1830 K
Table 3. Quantitative comparison of 2×, 4× and 8× SR on Mars remote sensing datasets.

Method | Scale | PSNR/dB | SSIM
Bicubic | 2× | 34.09 | 0.8228
LapSRN [10] | 2× | 34.72 | 0.8477
RSTSRN (ours) | 2× | 35.07 | 0.8525
Bicubic | 4× | 31.31 | 0.7124
LapSRN [10] | 4× | 31.94 | 0.7396
RSTSRN (ours) | 4× | 32.82 | 0.7510
Bicubic | 8× | 28.60 | 0.6128
LapSRN [10] | 8× | 29.23 | 0.6327
RSTSRN (ours) | 8× | 30.45 | 0.6638
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
