Article

Super Resolution Reconstruction of Mars Thermal Infrared Remote Sensing Images Integrating Multi-Source Data

School of Earth Sciences, Zhejiang University, Hangzhou 310027, China
*
Author to whom correspondence should be addressed.
Remote Sens. 2025, 17(13), 2115; https://doi.org/10.3390/rs17132115
Submission received: 11 April 2025 / Revised: 7 June 2025 / Accepted: 18 June 2025 / Published: 20 June 2025
(This article belongs to the Section AI Remote Sensing)

Abstract

As the planet most similar to Earth in the solar system, Mars plays an important role in exploring major scientific problems such as the evolution of the solar system and the origins of life. Research on Mars relies mainly on planetary remote sensing technology, among which thermal infrared remote sensing is of particular significance because it records the thermal radiation properties of the Martian surface. However, the spatial resolution of current Mars thermal infrared remote sensing images remains relatively low, limiting the detection of fine-scale thermal anomalies and the generation of higher-precision surface compositional maps. Although launching upgraded exploration satellites could enhance the spatial resolution of thermal infrared images, this approach entails high costs and long update cycles, making short-term improvement difficult. To address this issue, this paper proposes a super-resolution reconstruction method for Mars thermal infrared remote sensing images that integrates multi-source data. First, based on the principle of domain adaptation, we introduce a method that uses highly correlated visible light images as auxiliary references to enhance the spatial resolution of thermal infrared images. Then, a multi-source data integration method is designed to constrain the thermal radiation flux of the resulting images, ensuring that the radiation distribution remains consistent with the original low-resolution thermal infrared images. Both subjective and objective evaluations demonstrate that our method significantly enhances the spatial resolution of existing Mars thermal infrared images. It improves the quality of existing data, increasing the resolution of the original thermal infrared images by a factor of four. In doing so, it not only recovers finer texture details and produces better visual effects than typical super-resolution methods, but also maintains the consistency of thermal radiation flux, reducing the error after applying the consistency constraint by nearly tenfold and ensuring the applicability of the results for scientific research.

1. Introduction

Mars is the closest potentially habitable planet to Earth and the planet most similar to Earth in the solar system. Research on Mars holds great significance for exploring the evolution of the solar system, investigating the origin of life, and expanding the space available for human habitation [1,2,3,4,5]. Since the launch of the first Mars probe in 1960, extensive research has been conducted on various aspects of the planet, including its geomorphic units [6], subsurface hydrothermal systems [7], and environmental evolution [8]. In this research, planetary remote sensing technology is an indispensable tool for exploration. In particular, thermal infrared remote sensing, owing to its capability of detecting thermal radiation properties, has been extensively used to identify buried water ice and subsurface hydrothermal systems, retrieve surface temperature distributions, and search for traces of life [7,9,10], and thus bears crucial research significance and scientific value.
Launched in 2001, the Thermal Emission Imaging System (THEMIS) currently provides the highest-resolution thermal infrared remote sensing images with global coverage of Mars. THEMIS data have been widely used for monitoring surface thermal anomalies and extracting characteristic geomorphic features [9]. At spatial scales ranging from hundreds of meters to kilometers, they have contributed to numerous significant scientific discoveries, such as the investigation of water ice deposits [11], the identification of gullies likely formed by melting glacial ice [12], and the generation of surface compositional and thermal inertia maps. However, the highest spatial resolution of THEMIS is limited to 100 m, which restricts the detection of fine-scale thermal anomalies and hinders the generation of high-precision thermal inertia maps [13]. Most THEMIS-based compositional analyses of the Martian surface rely on spectral deconvolution techniques, which require spectral libraries to unmix pixel-level signals and estimate mineral abundances [14]. The accuracy of such methods is often affected by the limitations of the spectral libraries (e.g., the range and grain size of included minerals, and the signal-to-noise ratio). In addition, high-resolution visible imagery is often required to aid the detection and interpretation of smaller geological structures [9]; a resolution more comparable to that of Earth-based imagery would allow geologists to better interpret Martian geological activity by referencing surface processes observed on Earth. While hardware upgrades could enhance image quality, such improvements require long development cycles and substantial costs [3]. Therefore, scholars have turned to Super-Resolution Reconstruction (SRR) methods as an alternative approach to improving spatial resolution, aiming to fully exploit the value of existing data.
Current super-resolution reconstruction methods for remote sensing images can be divided into two categories. The first relies solely on the original images; such methods enhance image quality by extracting features in either the frequency domain or the spatial domain and applying post-processing techniques. However, when applied to Mars thermal infrared remote sensing images, they struggle to generate refined details because no higher-resolution ground truth reference is available [15]. The second introduces higher-resolution reference images; such methods establish mapping relationships between the features of the original images and higher-resolution target images and use these relationships to achieve super-resolution reconstruction. They can effectively improve the visual quality of the resulting images, but they primarily focus on perceptual enhancement and ignore the data fidelity (e.g., gray-scale distribution, pixel values) contained in the image. The availability of high-resolution visible light images of Mars enables the application of this second category; however, discrepancies in thermal radiation properties between the output results and the original thermal infrared images may compromise their reliability for scientific analysis and thermal-related studies [16,17].
To generate high-resolution and scientifically reliable results, this paper proposes a super-resolution reconstruction method for Mars thermal infrared remote sensing images that integrates multi-source data. First, a domain-adaptive network is employed to introduce highly correlated high-resolution visible light references, enhancing the spatial resolution of the original thermal infrared images. Building on this, a thermal radiation property constraint is applied to ensure consistency in thermal radiation flux between the output and the original images. By preserving both spatial detail and thermal radiometric accuracy, the proposed method enhances the reliability of super-resolution reconstruction, making the output results more suitable for scientific research related to thermal radiation properties.

2. Related Works

Super-resolution reconstruction (SRR) is the process of generating high-resolution (HR) images from low-resolution (LR) images based on existing image features. In this process, the LR image can be regarded as a degraded version of the HR image, modeled through a blur kernel and a degradation function. Super-resolution reconstruction can therefore also be regarded as a numerical simulation process that optimizes the parameters of the reconstruction model so that the generated super-resolved image approximates the original high-resolution image as closely as possible [17]. Super-resolution reconstruction methods for remote sensing images can be divided into two categories [18]: (1) reconstruction-based methods; (2) learning-based methods.

2.1. Reconstruction-Based Method

Reconstruction-based SRR methods manually extract features from the original LR image and enhance them based on mathematical models and constraints, thereby improving spatial resolution. These methods can be divided into two categories: frequency domain-based SRR methods and spatial domain-based SRR methods.
Frequency domain-based methods transform the original LR image from the spatial domain to the frequency domain and use spectrum analysis techniques to enhance the frequency components. A common method applies the discrete wavelet transform to decompose remote sensing images and then interpolates in the frequency domain to reconstruct their resolution [19]. Other methods use a recursive weighted least-squares algorithm in the wavenumber domain to reconstruct high-resolution images from multiple low-resolution images with relevant content [20].
Spatial domain-based methods enhance resolution by utilizing the internal structures and textures of the original images. The most widely used approaches are spatial interpolation and convex set projection. Spatial interpolation is a commonly used discrete data resampling method that calculates pixel values at unknown locations based on the similarity principle of adjacent known pixel values [21]. Common interpolation methods include bilinear interpolation [22], bicubic interpolation [23], boundary-guided interpolation [24], and filter-mask based interpolation [25]. The convex set projection method is an optimization approach that iteratively solves problems under convex constraints. It assumes the existence of an initial closed convex set containing all mapping relationships between LR and HR images; by iteratively applying this prior, the projection matrix of the original image is estimated, leading to improved resolution [26,27,28].
Reconstruction-based SRR methods perform calculations based only on the internal information of the low-resolution original images, such as pixel intensity, spatial structures, and textures, which depend heavily on local features. Without external references, these methods struggle to reconstruct reliable fine textures and high-frequency information, reducing their effectiveness for large-scale applications. This drawback is particularly significant in Mars remote sensing, where capturing fine surface details and large-scale scenes is essential for accurate geological and thermal environmental analysis.

2.2. Learning-Based Method

Learning-based SRR methods automatically establish correspondences between the original LR image and HR reference images, and enhance the spatial resolution of the original image through learned feature mappings. Common learning-based methods include dictionary learning [18] and sparse representation [29]. Deep learning is another rapidly developing learning-based approach; owing to the powerful fitting capability of neural networks, it has become one of the most extensively studied methods. Deep learning-based methods can be divided into two categories, paired image-based methods and unpaired image-based methods, depending on the availability of corresponding LR-HR image pairs for training.
Paired image-based methods are the most conventional deep learning approaches, training neural networks on strictly aligned LR-HR image pairs. By leveraging the fitting capability of neural networks, these methods learn direct mapping relationships between high- and low-resolution images. Paired image-based methods can be broadly classified into Convolutional Neural Network (CNN)-based methods [30] and Generative Adversarial Network (GAN)-based methods. CNN-based methods improve reconstruction accuracy through various architectural optimizations [31,32], for example, introducing residual modules to mitigate vanishing gradients [33,34,35], introducing dense connection and recursive modules to enhance expressiveness without increasing the number of parameters [36,37], and using attention modules to model interdependence between channels [38]. A GAN-based super-resolution reconstruction network contains a generative network and a discriminative network [31]. Such methods improve similarity judgments beyond standard CNN-based approaches by using adversarial loss and content loss, thereby mitigating the high-frequency detail loss typically observed in CNN-based super-resolution models [39,40].
Unpaired image-based methods use the domain translation principle to reconstruct the original LR images [17]. These methods do not require strictly corresponding image pairs, thereby reducing the dependency on high-resolution training data. They continuously adapt and learn feature mappings between the target HR image and the original LR image, while also ensuring consistency in feature distribution between the original and resulting images [41]. A notable example is the Cycle-in-Cycle Generative Adversarial Network (CinCGAN) introduced by Yuan et al. [42]. CinCGAN uses two domain adaptation loops to achieve super-resolution reconstruction: the internal loop solves the degradation problem and converts LR images to clean LR images (images without noise), while the external loop solves the super-resolution problem and converts clean LR images to HR images.
Deep learning-based SRR methods inherit the black-box nature of neural networks, leaving the reconstructed images lacking interpretability. This challenge is particularly significant in remote sensing scenes, where ensuring radiometric consistency before and after super-resolution reconstruction is crucial. When applied to Mars thermal infrared images, these methods struggle to maintain thermal radiation flux consistency, limiting their reliability for subsequent scientific analysis.

3. Methods

Current super-resolution reconstruction methods confront two main challenges when applied to Mars thermal infrared remote sensing images. First, it is difficult to generate reliable high-resolution texture details: no higher-resolution thermal infrared images of the Martian surface exist, so real and credible references cannot be obtained. Second, it is difficult to ensure consistency of thermal radiation flux between the resulting image and the original image: pixel values in different bands of remote sensing images represent the reflectance at a specific wavelength, and super-resolution reconstruction destroys the physical meaning implicit in them.
One possible solution to the first challenge is to use images from other bands as references. The Martian surface has no vegetation or liquid water, and the atmosphere is thin; therefore, there is a strong correlation between visible light and thermal infrared remote sensing images of the same area. For the second challenge, the original thermal infrared images can be used as constraints on the resulting images so that the total pixel values remain consistent across the same spatial coverage, thereby ensuring the accuracy of the thermal radiation flux in the resulting images. Building on this analysis, this paper proposes a super-resolution reconstruction method for Mars thermal infrared remote sensing images. First, we improve the spatial resolution of thermal infrared images based on the domain adaptation principle; by leveraging strongly correlated visible light textures of the same region as references, we achieve a fourfold upsampling of the original thermal infrared images. Then, on this basis, we use the original thermal infrared images to constrain the thermal radiation flux in the resulting images, ensuring that the radiation distribution is consistent with the original image.
The overall method includes two steps: (1) super-resolution reconstruction of thermal infrared (IR) images with visible light (VIS) texture reference; (2) consistency constraints on the thermal radiation flux, as shown in Figure 1. This super-resolution framework is designed in a step-by-step, tightly integrated manner, where each component serves a specific role in the overall data flow. The first step applies two Upsampling Modules in sequence. Each module first upsamples its input image by a factor of two using bicubic interpolation and then uses the generated LR_bicubic to obtain the high-resolution pseudo-thermal infrared image pseudo-IR. Each upsampling module is a generative adversarial network designed on the principle of domain adaptation, including a forward cycle generator, a backward cycle generator, and two discriminators attached to each generator, following CycleGAN. The detailed structure is introduced in Section 3.1. The second step uses three neural networks to constrain the thermal radiation flux. The Intensity-extraction Network and the Gradient-extraction Network separately extract the intensity component (low-frequency background and grayscale distribution information) from the amplified original image LR_upscale and the gradient component (high-frequency boundary information) from HR-pseudo. LR_upscale is obtained by directly amplifying each pixel of the original image by a factor of four, and HR-pseudo is the result of the first step. These two components are then encoded by the Fusion Network to obtain the final high-resolution thermal infrared image Output.
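To make this data flow concrete, the following is a minimal PyTorch sketch of the two-step pipeline (PyTorch is the framework used in Section 4.3). The module arguments are hypothetical placeholders standing in for the networks described in Sections 3.1 and 3.2, not the authors' implementation.

```python
import torch.nn.functional as F

# Minimal sketch of the overall data flow: two cascaded 2x upsampling modules
# (step 1), followed by the thermal radiation flux constraint (step 2).
# up1, up2, intensity_net, gradient_net and fusion_net are placeholder modules.
def super_resolve(lr_ir, up1, up2, intensity_net, gradient_net, fusion_net):
    # Step 1: each module receives a bicubically upsampled input (LR_bicubic)
    # and produces a pseudo thermal infrared image at the new scale.
    x = F.interpolate(lr_ir, scale_factor=2, mode="bicubic", align_corners=False)
    x = up1(x)
    x = F.interpolate(x, scale_factor=2, mode="bicubic", align_corners=False)
    hr_pseudo = up2(x)                                   # HR-pseudo at 4x resolution

    # Step 2: LR_upscale replicates every original pixel 4x4 times; the two
    # extraction networks provide the intensity and gradient components that
    # the Fusion Network combines into the final Output.
    lr_upscale = F.interpolate(lr_ir, scale_factor=4, mode="nearest")
    intensity = intensity_net(lr_upscale)
    gradient = gradient_net(hr_pseudo)
    return fusion_net(intensity, gradient)
```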

3.1. Super-Resolution Reconstruction of Thermal Infrared Images with Visible Light Texture Reference

This step is designed to obtain the high-resolution pseudo-thermal infrared image (HR-pseudo). HR-pseudo is generated by progressively transforming the original thermal infrared image toward the visible domain through two upsampling modules, thereby incorporating higher-resolution textures. The schematic diagram of the first step is shown in Figure 1a. Each upsampling module has the same structure: two generators, each with two attached discriminators, with the two generators forming a ring structure that constitutes a forward cycle and a backward cycle (shown in Figure 2). In each module, the generators transfer the style of the textures shown in the input images (RealIR_bicubic and RealVIS in Figure 2), while the discriminators judge the difference between the resulting image (FakeIR) and the visible light label (RealVIS_bicubic). We employ the forward cycle to transform the style from the thermal infrared domain to the visible light domain, thereby generating the pseudo-thermal infrared image pseudo-IR.
Figure 2 illustrates the detailed process of the forward cycle. In this process, the original thermal infrared image is first upsampled using bicubic interpolation, producing RealIR_bicubic; this upsampled image then passes through the forward cycle generator to obtain the pseudo-thermal infrared image FakeIR. FakeIR is the output of this step, serves as the input to the following step, and corresponds to the pseudo-IR shown in Figure 1a. RecycleIR is the image obtained by passing FakeIR through the backward cycle generator; this image lies in the thermal infrared domain and is compared with RealIR_bicubic to maintain the content of FakeIR, serving as a constraint on its identity consistency. Two discriminators separately evaluate the characteristics and the gradient characteristics of FakeIR. The first discriminator evaluates whether the characteristics of FakeIR are consistent with those of the original visible light reference RealVIS by feeding both images into the network. Meanwhile, the gradient discriminator processes images after their high-frequency details have been computed using the Laplacian operator (the obtained images are FakeIR_gradient and RealVIS_gradient). This discriminator determines whether the high-frequency detail distribution of the resulting image is consistent with that of the original visible light label, thereby highlighting the detail components in the resulting image.
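As an illustration, a gradient map of the kind fed to the gradient discriminator could be computed as below; the paper only states that the Laplacian operator is used, so the specific 4-neighbour kernel here is an assumption (Section 3.2.2 later describes an 8-neighbour variant for the gradient loss).

```python
import torch
import torch.nn.functional as F

# Sketch: high-frequency (edge) component of a single-band image, as used to
# build FakeIR_gradient and RealVIS_gradient for the gradient discriminator.
LAPLACIAN_KERNEL = torch.tensor([[0., 1., 0.],
                                 [1., -4., 1.],
                                 [0., 1., 0.]]).view(1, 1, 3, 3)

def laplacian(img):
    # img has shape (N, 1, H, W); the output has the same shape.
    return F.conv2d(img, LAPLACIAN_KERNEL.to(img.device, img.dtype), padding=1)
```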
The detailed structures of the generators and discriminators in Figure 2 are shown in Figure 3 and Figure 4. The discriminator is a Markov discriminator (PatchGAN). Unlike a general binary discriminator, whose output is only true/false, its judgment takes the form of a matrix, allowing the discriminator to consider the spatial information of different positions in the input image.
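A minimal PatchGAN-style discriminator of this kind is sketched below; the layer widths and depth are illustrative assumptions rather than the exact configuration in Figure 4.

```python
import torch.nn as nn

# Sketch of a Markov (PatchGAN) discriminator: the output is a matrix of
# patch-wise real/fake scores instead of a single scalar, so spatial
# information at different positions is preserved.
class PatchDiscriminator(nn.Module):
    def __init__(self, in_channels=1):
        super().__init__()
        def block(cin, cout, stride):
            return [nn.Conv2d(cin, cout, 4, stride, 1),
                    nn.InstanceNorm2d(cout),
                    nn.LeakyReLU(0.2, inplace=True)]
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 64, 4, 2, 1), nn.LeakyReLU(0.2, inplace=True),
            *block(64, 128, 2),
            *block(128, 256, 2),
            *block(256, 512, 1),
            nn.Conv2d(512, 1, 4, 1, 1),
            nn.Sigmoid())            # patch scores in [0, 1]

    def forward(self, x):
        return self.net(x)
```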
The complete loss function in this module consists of three parts: adversarial loss ($Loss_{GAN}$, $Loss_{GAN\_GRADIENT}$), cycle consistency loss ($Loss_{cycle}$), and identity mapping loss ($Loss_{identity}$). Their calculation positions are shown in Figure 2. Taking the forward cycle as an example, the complete loss function is shown in Formula (1).
$$Loss = Loss_{GAN} + Loss_{GAN\_GRADIENT} + Loss_{cycle} + Loss_{identity} \quad (1)$$
The detailed adversarial losses are shown in Formulas (2) and (3). The input image LR_bicubic is denoted as $x$ and the visible light label RealVIS as $y$. $G_1$ represents the forward cycle generator and $G_2$ the backward cycle generator. $G_1(x)$ is the output of image $x$ after passing through the forward cycle generator, $\log D(y)$ is the output of the discriminator applied to $y$, and $E_y$ denotes the expectation over the distribution of $y$ (likewise $E_x$ over $x$). The discriminator is trained by maximizing $E_y[\log D(y)]$ and $E_x[\log(1 - D(G_1(x)))]$. $Grad(x)$ denotes the gradient of the image computed with the Laplacian operator.
$$Loss_{GAN} = E_y[\log D(y)] + E_x[\log(1 - D(G_1(x)))] \quad (2)$$
$$Loss_{GAN\_GRADIENT} = E_y[\log D(Grad(y))] + E_x[\log(1 - D(Grad(G_1(x))))] \quad (3)$$
The detailed cycle consistency loss is shown in Formula (4). The RecycleIR obtained after the forward and backward generator is compared with the original thermal infrared image RealIR to obtain this loss. It is used to maintain the consistency of the image content after forward and backward cycles.
$$Loss_{cycle} = \left\| G_2(G_1(x)) - x \right\|_2 \quad (4)$$
The detailed identity mapping loss is shown in Formula (5). The image is passed into the backward cycle generator and compared with itself, which is used to control the consistency of the image style and hue.
$$Loss_{identity} = \left\| G_2(x) - x \right\|_2 \quad (5)$$
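The sketch below assembles the loss terms of Formulas (1)-(5) for the forward cycle as one possible reading of the training objective, not the authors' code. It assumes the discriminators end in a sigmoid (as in the PatchGAN sketch above) and reuses the hypothetical laplacian() helper from earlier in this section; the discriminators' own updates (which use RealVIS and its gradient map as real samples) are omitted.

```python
import torch
import torch.nn.functional as F

def forward_cycle_losses(G1, G2, D, D_grad, lr_bicubic):
    """G1/G2: forward/backward generators; D/D_grad: the two discriminators."""
    fake_ir = G1(lr_bicubic)                  # pseudo-IR carrying visible textures
    recycle_ir = G2(fake_ir)                  # mapped back to the thermal-infrared domain

    # Adversarial losses (Formulas (2) and (3)): the generator is trained so the
    # discriminators score its outputs (and their gradient maps) as real.
    pred = D(fake_ir)
    pred_grad = D_grad(laplacian(fake_ir))    # FakeIR_gradient
    loss_gan = F.binary_cross_entropy(pred, torch.ones_like(pred))
    loss_gan_grad = F.binary_cross_entropy(pred_grad, torch.ones_like(pred_grad))

    # Cycle consistency loss (Formula (4)) and identity mapping loss (Formula (5)).
    loss_cycle = torch.norm(recycle_ir - lr_bicubic, p=2)
    loss_identity = torch.norm(G2(lr_bicubic) - lr_bicubic, p=2)

    # Formula (1): the four terms are summed.
    return loss_gan + loss_gan_grad + loss_cycle + loss_identity
```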

3.2. Consistency Constraints on Thermal Radiation Flux

This step is designed to constrain the thermal radiation properties of the intermediate result (HR-pseudo) generated in the previous stage. As shown within the green dashed box in Figure 1, although HR-pseudo exhibits enhanced spatial resolution, the super-resolution reconstruction process inevitably alters the original grayscale values of the thermal infrared image, potentially compromising the accuracy of the embedded radiation information. To address this, this step introduces consistency constraints based on the original low-resolution thermal image. Specifically, it extracts gradient components from HR-pseudo and intensity components from the original thermal infrared image LR to ensure that the radiation distribution in the final high-resolution thermal infrared output remains consistent with the physical properties of the source data. This constraint-driven enhancement process is implemented through a dedicated thermal radiation flux constraint module, which consists of three sub-parts: the Intensity-Extraction Network, the Gradient-Extraction Network, and the Fusion Network. The detailed architecture of these three sub-networks is illustrated in Figure 5.

3.2.1. Intensity-Extraction Network

The Intensity-extraction Network is used to extract the thermal radiation flux information in LR_upscale (the original image directly amplified four times) and constrain the same information in the resulting image Output. The backbone of this network is a UNet, which fully considers feature information at different scales and performs better on the raw thermal infrared image, which contains little high-frequency information.
The intensity loss function $Loss_{intensity}$ used to train this network is designed based on the ideas of a sliding window and multi-layer convolution (shown in Figure 6). The sliding window ensures that the sum of pixel values within the same coverage area remains consistent between LR_upscale and the resulting image Output, while the multi-layer convolution constrains the sums of pixel values at different scales to be consistent.
Assuming n-fold upsampling of the original thermal infrared image, the sum of the pixel values in any n × n cell of the resulting high-resolution image should equal the value of the matching pixel in the original image. This embodies the consistency of the thermal radiation fluxes within the same coverage, which can be expressed by Equation (6); the goal is to make ε as close to 0 as possible. In this paper, the mapping process can be regarded as a sliding-window resampling process: three convolution operators are applied simultaneously to LR_upscale and Output, with kernel sizes of n × n, (n + 1) × (n + 1), and (n + 1) × (n + 1) (n is set to 4, the same as the upsampling factor), as shown in Figure 6. A loss is then calculated for the output of each layer, yielding three intensity losses at different scales: Loss1-1, Loss1-2, and Loss1-3. These three losses are summed to obtain the final intensity loss function.
$$\varepsilon = \sum_{i=1}^{n}\sum_{j=1}^{n} LR\_upscale_{ij} - \sum_{i=1}^{n}\sum_{j=1}^{n} Output_{ij} \quad (6)$$
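A possible implementation of this multi-scale constraint is sketched below: all-ones convolution kernels compute the window sums of LR_upscale and Output at the three window sizes given in the text, and the sums are compared. The stride and the use of an L1 comparison are assumptions.

```python
import torch
import torch.nn.functional as F

# Sketch of the intensity (thermal radiation flux) loss built from Formula (6):
# sliding-window sums of LR_upscale and Output are forced to agree at three scales.
def intensity_loss(lr_upscale, output, n=4):
    loss = 0.0
    for k in (n, n + 1, n + 1):                        # window sizes from the text
        kernel = torch.ones(1, 1, k, k, device=output.device, dtype=output.dtype)
        sum_lr = F.conv2d(lr_upscale, kernel)          # window sums of LR_upscale
        sum_out = F.conv2d(output, kernel)             # window sums of Output
        loss = loss + F.l1_loss(sum_out, sum_lr)       # Loss1-1 / Loss1-2 / Loss1-3
    return loss
```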

3.2.2. Gradient-Extraction Network

The Gradient-extraction Network is used to extract the high-frequency boundary information in HR-pseudo (the thermal infrared image with visible light textures) and constrain the same information in the resulting image Output. This network is a multi-scale combined network that fuses the clear semantic details from the lower layers with the overall information from the higher layers. The network consists of five basic down-sampling modules; the output of each down-sampling module is passed through an up-sampling module so that all intermediate images have the same size. The average of all outputs is then calculated and combined with the five individual outputs to obtain the final six-channel image.
The gradient loss function used to train this network is designed based on the idea of a sharpening filter. This loss function extracts edge information by calculating the second-order differential of the image. The calculation process is formulated in Equations (7)–(9), where f is the original image, G(x) is the second-order differential in the horizontal direction, and G(y) is the second-order differential in the vertical direction. Compared with the first-order differential, the second-order differential has a dual response to changes in gray scale and can retain information such as the illumination direction. It also has a stronger step response to discrete point gradients, so it can obtain better detail information in image enhancement. In practice, a 3 × 3 convolution kernel with a central value of 8 and surrounding values of −1 is used, and a sliding-window approach is adopted to calculate the differences between adjacent pixels and realize gradient extraction.
$$G(x) = \frac{\partial^2 f}{\partial x^2} = f(x+1, y) + f(x-1, y) - 2f(x, y) \quad (7)$$
$$G(y) = \frac{\partial^2 f}{\partial y^2} = f(x, y+1) + f(x, y-1) - 2f(x, y) \quad (8)$$
$$\nabla^2 f = G(x) + G(y) \quad (9)$$
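The following sketch implements Equations (7)-(9) with the 3 × 3 kernel described above (central value 8, surrounding values −1); comparing the two gradient maps with an L1 distance is an assumption about how the loss is finally formed.

```python
import torch
import torch.nn.functional as F

# 8-neighbour second-order differential (sharpening) kernel from the text.
SHARPEN_KERNEL = torch.tensor([[-1., -1., -1.],
                               [-1.,  8., -1.],
                               [-1., -1., -1.]]).view(1, 1, 3, 3)

def second_order_gradient(img):
    # Sliding-window difference between each pixel and its eight neighbours.
    return F.conv2d(img, SHARPEN_KERNEL.to(img.device, img.dtype), padding=1)

def gradient_loss(output, hr_pseudo):
    # Keep the high-frequency boundaries of Output consistent with HR-pseudo.
    return F.l1_loss(second_order_gradient(output), second_order_gradient(hr_pseudo))
```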

3.2.3. Fusion Network

The Fusion Network is used to fuse the intensity extraction results and the gradient extraction results. The outputs of the two networks above are concatenated along the channel dimension and then passed through convolution, batch normalization, and ReLU layers to obtain the final result Output. The loss function used to train this module is a combination of the loss functions of the two aforementioned networks.
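A minimal sketch of such a fusion block is given below; the channel counts (one intensity channel plus the six-channel gradient output mentioned in Section 3.2.2) and the number of layers are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Sketch of the Fusion Network: channel-concatenate the two branches, then
# convolution + batch normalization + ReLU, ending in a single-band output.
class FusionNet(nn.Module):
    def __init__(self, in_channels=7, mid_channels=32):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(in_channels, mid_channels, 3, padding=1),
            nn.BatchNorm2d(mid_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid_channels, 1, 3, padding=1))

    def forward(self, intensity_feat, gradient_feat):
        return self.fuse(torch.cat([intensity_feat, gradient_feat], dim=1))
```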

4. Experiment

4.1. Study Area

The study area in this article spans 29.19–33.48°N and 84.28–89.85°E (shown in the red frame in Figure 7) on the surface of Mars, located on the west side of South Utopia Planitia (SUP). This area is close to the dividing line between the southern highlands and the northern plains and thus contains diverse terrain units. This paper divides the terrain units in this area into four categories: crater, smooth surface, rough surface, and ridge.
Utopia Planitia is the largest known impact basin on Mars [43]. Some studies suggest that the large-scale fan-shaped units in this area may be derived from degraded permafrost [44,45]. Therefore, this area is believed to have once had an environment suitable for life and has been extensively studied. In 1976, the Viking 2 lander touched down in South Utopia Planitia. In May 2021, China’s first Mars probe “Tianwen-1” also landed in this area. Tianwen-1 carries the “Zhurong” Mars rover, which is further exploring the geological and geomorphological environment of this area.
In order to construct a complete dataset containing all kinds of terrain units, this paper also selected five supplementary sample selection areas to expand the training dataset (shown in the black frames in Figure 7). These areas are located in four regions: Olympus Mons, Nilosyrtis Mensae, Terra Sabaea, and Utopia Planitia. They are evenly distributed across Mars in longitude and include all four categories of terrain units defined in this paper.

4.2. Dataset

The thermal infrared images used in this paper are sourced from the Thermal Emission Imaging System (THEMIS) data products in NASA’s Planetary Data System, and the visible light images are the Context Camera (CTX) data products (https://ode.rsl.wustl.edu/mars/mapsearch (accessed on 16 November 2022)) on board the Mars Reconnaissance Orbiter (MRO). The detailed parameters of these two data sources are shown in Table 1.
The THEMIS data product currently provides the highest-resolution thermal infrared remote sensing images available for Mars and is the most widely used Mars thermal infrared data product. THEMIS is a multispectral thermal infrared imager whose 12.57 μm band provides 100-m resolution global mosaics of Mars for both daytime and nighttime. This product is extensively utilized for surface thermal anomaly monitoring [10]. The coordinates of the THEMIS images were transformed to obtain a complete map-projected image; a diagram of the resulting image is shown in Figure 8.
The CTX data product provides the highest-resolution visible light images with global coverage of Mars, with a spatial resolution of 5–6 m and a spectral bandwidth of 0.5–0.8 μm. This panchromatic imagery is primarily used to provide an analytical background for higher-resolution observations of the Martian surface. Images from Martian Year 30 (around 2010) with solar longitudes between 90° and 180° (Martian northern hemisphere summer) were selected to ensure imaging times similar to those of the thermal infrared images. The downloaded CTX images are rectangles of varying sizes, so histogram normalization, image mosaicking, geometric correction, and resampling were applied to the selected CTX images to obtain a stitched, aligned result. A diagram of the resulting image is presented in Figure 9.
The resulting visible light and thermal infrared images were then cropped and augmented to generate the required training/validation dataset. To balance the number of image pairs between different terrain units, data augmentation was performed on the obtained image pairs: horizontal/vertical flipping was applied to the images of rough surfaces and impact craters, while horizontal/vertical flipping and 90° rotation were applied to the images of smooth surfaces. Finally, the dataset was divided into validation and training sets in a 7:3 ratio. Detailed parameters of the dataset and example images are provided in Figure 10 and Table 2.
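The terrain-dependent augmentation described above could be expressed as follows, assuming each sample is a co-registered (thermal infrared, visible) pair of NumPy arrays; the terrain label strings are hypothetical.

```python
import numpy as np

# Sketch of the augmentation scheme: flips for rough surfaces, craters and
# smooth surfaces, plus an extra 90-degree rotation for smooth surfaces.
def augment_pair(ir, vis, terrain):
    pairs = [(ir, vis)]
    if terrain in ("rough", "crater", "smooth"):
        pairs.append((np.fliplr(ir).copy(), np.fliplr(vis).copy()))   # horizontal flip
        pairs.append((np.flipud(ir).copy(), np.flipud(vis).copy()))   # vertical flip
    if terrain == "smooth":
        pairs.append((np.rot90(ir).copy(), np.rot90(vis).copy()))     # 90-degree rotation
    return pairs
```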

4.3. Comparisons

This paper conducts two experiments to evaluate the effectiveness of the proposed method. Experiment 1 assesses the texture quality of the HR-pseudo obtained by the super-resolution reconstruction process (step 1) and demonstrates its advantages over other typical super-resolution reconstruction methods. Experiment 2 verifies the thermophysical property preservation of the proposed step 2; this experiment compares images before and after the thermal radiation flux constraint (LR_upscale and Output).

4.3.1. Comparison Between Proposed Thermal Infrared Image Super-Resolution Reconstruction Method and Comparing Methods

Methods from five major categories were selected for comparison, resulting in a total of eight specific comparative experiments. Each method was selected as a representative of its category to ensure a comprehensive and balanced evaluation. These categories include:
(1) Spatial interpolation-based methods, represented by Bicubic interpolation;
(2) CNN-based single image super-resolution (SISR) methods, including SRCNN [30] and SRResNet [39];
(3) GAN-based single image super-resolution methods, including ESRGAN [40], Pix2Pix [46], and CycleGAN [47];
(4) Diffusion model-based single image super-resolution methods, including SR3 [48];
(5) Reference-based super-resolution (RefSR) methods, including SRNTT [49].
During the inference phase, none of the methods require high-resolution reference images; only the low-resolution thermal infrared image is fed into the networks to evaluate their high-resolution outputs, ensuring comparability in computational cost and runtime. All methods were run under the same hardware and software environment: an NVIDIA GeForce RTX 3090 GPU with 64 GB RAM, with PyTorch (2.0.0) as the deep learning framework. The hyperparameters of all networks were kept consistent during training: the gradient optimization algorithm is Adam, the learning rate is set to 0.0002, and the betas are fixed to 0.5 and 0.999. Owing to the different inherent architectures of the networks (GANs and diffusion models typically require more training epochs than CNNs), the number of training epochs varies across networks; we ensured that all networks were trained until their loss functions converged and stabilized.
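For reference, the shared optimizer configuration corresponds to the following PyTorch call; `params` stands for the parameters of whichever network is being trained.

```python
import torch

# Optimizer settings kept consistent across all comparison experiments:
# Adam with learning rate 0.0002 and betas (0.5, 0.999).
def make_optimizer(params):
    return torch.optim.Adam(params, lr=2e-4, betas=(0.5, 0.999))
```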
We also evaluated the computational efficiency of all comparison methods. Table 3 lists the categories and parameter counts of all comparison methods. For our method, the super-resolution and constraint modules (steps 1 and 2) together contain approximately 61.2 million parameters, which remains within a similar scale of complexity to GAN-based methods such as Pix2Pix.
A total of five evaluation criteria were selected to assess the texture quality of HR-pseudo and its correlation with the reference visible light image. Image quality criteria, including Entropy (EN) and Spatial Frequency (SF), are used to measure the amount of information contained in the image. Correlation criteria, including the Peak Signal-to-Noise Ratio (PSNR), the Structural Similarity Index Measure (SSIM), and the image fusion quality index of Wang and Bovik ($Q^{ab/f}$), are used to judge the correlation between the resulting image and the original visible light image. Details of the evaluation criteria are listed in Table 4.
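For the two image-quality criteria, EN and SF can be computed as sketched below for an 8-bit single-band image; PSNR, SSIM, and $Q^{ab/f}$ follow their standard definitions and are omitted here.

```python
import numpy as np

def entropy(img):
    # EN: Shannon entropy of the grayscale histogram, in bits.
    hist, _ = np.histogram(img, bins=256, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def spatial_frequency(img):
    # SF: overall activity level measured from row and column differences.
    img = img.astype(np.float64)
    rf = np.sqrt(np.mean(np.diff(img, axis=1) ** 2))   # row frequency
    cf = np.sqrt(np.mean(np.diff(img, axis=0) ** 2))   # column frequency
    return np.sqrt(rf ** 2 + cf ** 2)
```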

4.3.2. Comparison Before and After Thermal Radiation Flux Constraints

This experiment is designed to prove the effectiveness of the thermal radiation flux constraint by comparing the image grayscale distributions before and after the restriction. In this part, we verify the grayscale distribution consistency by calculating the Mean Square Error (MSE) between the resulting image and the original image.
First, the resulting thermal infrared image is scaled down according to the magnification factor n (Equation (10)): the sum of the thermal radiation flux values in each n × n unit is calculated to obtain $X'_{SR}$, which has the same size as the original image. The thermal radiation fluxes within the same coverage are then compared using the MSE (Equation (11)).
$$X'_{SR}(i, j) = \sum X_{SR}(in : in + n,\; jn : jn + n) \quad (10)$$
$$MSE = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} \left( X(i, j) - X'_{SR}(i, j) \right)^2 \quad (11)$$
In the formulas, X denotes the original thermal infrared image, $X_{SR}$ denotes the resulting super-resolved thermal infrared image, $X'_{SR}$ denotes the resulting image after n-times downsampling, and H and W denote the numbers of rows and columns of the reduced image, respectively.
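A compact NumPy sketch of Equations (10) and (11) is given below, assuming X and $X_{SR}$ are 2D arrays and the image sizes are exact multiples of n.

```python
import numpy as np

# Block-sum the super-resolved image onto the LR grid (Equation (10)) and compare
# it with the original thermal infrared image via the MSE (Equation (11)).
def flux_mse(x_lr, x_sr, n=4):
    h, w = x_lr.shape
    x_sr_blocks = x_sr.reshape(h, n, w, n).sum(axis=(1, 3))   # X'_SR
    return np.mean((x_lr - x_sr_blocks) ** 2)
```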

5. Results and Discussion

5.1. Discussion on Experimental Results of Thermal Infrared Image Super-Resolution Reconstruction

Objective evaluation metrics for the high-resolution thermal infrared images were calculated using the original visible light images as labels; the results are shown in Table 5. The first two columns of the table report the difference between the entropy (EN) and spatial frequency (SF) of the resulting image and those of the label (the smaller, the better). The last three columns evaluate the correlation between the resulting image and the reference visible light label image (the larger, the better).
The statistical data show that the super-resolution reconstruction method used in this article achieves the best reconstruction results. SF_dist and EN_dist show that the difference between the high-frequency components of the resulting image and those of the original visible light image is the smallest, meaning that the grayscale distribution of our result is most similar to the original visible light image. PSNR and $Q^{ab/f}$ show that the obtained image has the highest quality and retains the edge texture details of the original image to the greatest extent. The only exception is SSIM, on which Bicubic, the method with the blurriest results, achieves the best value. This is because the index considers not only the difference in texture information but also the similarity in brightness, and the grayscale difference between thermal infrared and visible light images lowers the overall value of this index.
The texture accuracy of the resulting images generated by this method is analyzed for the four classified Martian surface terrain units (Figure 11, Figure 12, Figure 13 and Figure 14). In these four comparison figures, the visible light image label (Visible Light Image) and the texture reference for the original thermal infrared image (Bicubic) are shown at the top left. The Bicubic results are obtained by interpolation without introducing visible light textures, so they serve as a texture reference for the input image. The remaining images show the results of the comparison methods and of our proposed method.
Among the comparison methods, the results obtained by SRCNN are relatively blurry, and the features in the resulting images are meaningless. SRResNet maintains the shapes of the original terrain units but fails to introduce clear visible light reference textures; compared with Bicubic, these methods enhance edge details only slightly, showing that it is difficult to achieve good results in large-scale heterogeneous image super-resolution using CNN-based methods. SR3 fails to restore credible textures in the Mars scenes: the positions and shapes of the generated objects are inconsistent with the original input images, especially on relatively smooth surfaces. The resulting images obtained by ESRGAN and Pix2Pix are clear but differ greatly from the original visible light labels: crater structures and detailed ridge directions cannot be identified, rough surfaces become plain, and only the smooth surface terrain maintains its overall shape. These inaccuracies may be due to alignment errors in the training dataset; these two methods are paired image-based methods, and the heterogeneity and significant resolution difference between thermal infrared and visible light images result in low similarity between the LR and HR images, which further influences the final high-resolution results. The results of SRNTT maintain the relative positions of textures well (Figure 11 and Figure 14), but owing to its relatively simple backbone network, the accuracy of the restored edge details is limited (Figure 12). CycleGAN and our method are the two most effective methods among all comparisons; both maintain the original feature locations and shapes while introducing texture details. CycleGAN maintains the high-frequency shapes of the original terrain units (Figure 11) and significantly enhances the clarity of features (Figure 13 and Figure 14). However, CycleGAN is prone to producing meaningless high-frequency textures on smooth surfaces (Figure 11 and Figure 13), and its ability to fuse visible light reference images is poor.
Our method performs best among all comparisons. It restores the location and complete ring structure of craters while suppressing irrelevant texture details to the greatest extent (Figure 11); maintains the ground relief and surface texture directions in the rough surface terrain unit (Figure 12); retains the significant structural features of the original image in the smooth surface terrain unit, making the resulting image consistent with the original visible light image to the greatest extent (Figure 13); and reflects the direction and detailed texture of the ridge lines (Figure 14). The shortcoming of this method is that the light-shade relationship in the resulting image is not pronounced, which is addressed in the second step.

5.2. Discussion on Experimental Results of Thermal Radiation Flux Constraint

The objective evaluation metrics of the resulting image before and after the thermal radiation flux consistency constraint are shown in Table 6. The first evaluation metric is MSE, which measures the grayscale difference between the resulting image and the original image. As can be seen, the error after the consistency constraint drops by nearly a factor of ten, indicating that the gap in thermal radiation properties has been greatly reduced and the grayscale is maintained.
Columns 2–6 list texture quality evaluation metrics, the same as those used in the first experiment. These metrics evaluate the quality of the resulting image and its similarity with the original visible light/thermal infrared image. The statistical data show that the image after the constraint has higher EN and SF, indicating that the resulting image contains more information after the consistency constraint. Moreover, this step improves the similarity in brightness and structure between the resulting image and the original image, increasing the structural similarity (SSIM), peak signal-to-noise ratio (PSNR), and fusion quality ($Q^{ab/f}$) of the obtained image.
Figure 15, Figure 16, Figure 17 and Figure 18 show the differences in grayscale values in the four terrain units before and after the thermal radiation flux consistency constraint. From left to right, each figure shows the visible light label image, the image after bicubic interpolation, the image before the consistency constraint, the difference map between the image before the consistency constraint and the original thermal infrared image, the image after the consistency constraint, and the difference map between the image after the consistency constraint and the original thermal infrared image. The difference maps use a color bar to indicate the difference range: the darker the color, the greater the difference from the original thermal infrared image.
It can be seen that, owing to the influence of the visible light reference image, the grayscale distribution of the high-resolution result before the thermal radiation flux consistency constraint is unclear, the contrast is low, and the overall boundaries are blurred, which differs greatly from the original thermal infrared image. The difference maps show the grayscale error more clearly: the darker the color, the greater the error between the resulting image and the original thermal infrared image.
The constraint reduces the grayscale distribution error. In the crater terrain unit, the gray values of the image before the constraint are uniform, and the bright and dark surfaces and circular structures are not obvious; all these problems are improved after the thermal radiation flux consistency constraint (Figure 15). In the rough surface terrain unit, the image before the constraint cannot emphasize the incident direction of light or reflect the concave and convex details of the terrain; after the constraint, the grayscale distribution is basically consistent with the original thermal infrared image (Figure 16). In the smooth surface terrain unit, the constraint shows the best result: the grayscale distribution characteristics are more significant, and, as observed in the difference map, the absolute error is smaller and more evenly distributed (Figure 17). In the ridge terrain unit, the dark texture details are not obvious before the constraint; after the constraint, the textures in dark areas are clearer and the grayscale transitions are smoother (Figure 18).
In conclusion, the thermal radiation flux constraint corrects most of the grayscale error introduced by super-resolution. Most of the remaining errors in the resulting image occur at the edges of ground objects, which is inevitable given the improvement in spatial resolution. Compared with the original image, the constrained image performs better in areas with obvious contrast between light and dark, while its ability is relatively weak in areas with gentle grayscale changes. We use neural networks to constrain the thermal infrared radiant flux in the multi-scale constraint step; compared with a sliding-window method that directly constrains the sums of corresponding pixel values, our method retains the integrity of edge details, although it cannot achieve a complete one-to-one correspondence in the mathematical sense. At present, the improvement in the resulting image meets expectations in the quantitative evaluation.

5.3. Application

This section demonstrates the results obtained by the super-resolution reconstruction method proposed in this article. The original 100 m resolution thermal infrared images of the South Utopia Planitia study area (29.19–33.48°N, 84.28–89.85°E) are reconstructed to 25 m resolution, and the resulting images are stitched together to form a full-sized image (the result is shown in Figure 19 and the original image in Figure 9). Four areas within the study area (shown in yellow wireframes in Figure 19 and numbered A/B/C/D) are selected for zoomed-in display (Figure 20, Figure 21, Figure 22 and Figure 23); they cover all four terrain units of crater, rough surface, smooth surface, and ridge as categorized in Section 4.1.
It can be seen that, after applying the method proposed in this article for super-resolution reconstruction, the resolution of the original thermal infrared image is significantly improved, making the edges of fuzzy structures, such as crater rims and protrusions, more distinct. The areas with large thermal radiation differences on the smooth surface are preserved, and the original thermal radiation distribution, such as the bright and dark surfaces of the ridges, remains consistent. The image after super-resolution reconstruction effectively maintains the grayscale distribution of the original image.

6. Conclusions

As an important means of planetary exploration, thermal infrared remote sensing of Mars has been widely used in research scenarios such as monitoring thermal radiation features and analyzing global climate change. Currently, the highest resolution of Mars thermal infrared images is only 100 m, which is relatively low; this restricts the detection of fine-scale thermal anomalies and hinders the generation of high-precision thermal inertia maps and surface compositional maps. This paper proposed a super-resolution reconstruction method for Mars thermal infrared remote sensing images that integrates multi-source data, enhancing the spatial resolution of Mars thermal infrared images from 100 m to 25 m. Through this method, we not only enhanced the spatial resolution to produce good visual effects but also maintained the consistency of thermal radiation flux, ensuring the reliability of the results for further scientific research. This paper addressed two key challenges in the super-resolution reconstruction of Mars thermal infrared images: the generation of reliable high-resolution texture details and the preservation of thermal radiation flux consistency. The key contributions of this work are summarized as follows:
  • A super-resolution reconstruction method referenced by visible light images is proposed. Based on the principle of domain adaptation, a super-resolution network with a two-stage upsampling structure is designed, which utilizes visible light image textures as references. This network is trained on unpaired datasets with distinct domain characteristics, enabling cross-domain representation alignment. Experimental results demonstrate that the proposed method significantly improves the visual quality of Mars thermal infrared imagery while maintaining its correlation with the original imagery.
  • A thermal radiation flux consistency constraint method is proposed. Based on the original low-resolution thermal infrared image, a Gradient-extraction network and an Intensity-extraction network are designed to separate the high-frequency boundary information and low-frequency background information in the original image. The consistency of the thermal radiation flux is then constrained by controlling the sum of pixel values between the resulting image and the original thermal infrared image within the same coverage. Experimental results demonstrate that this method can ensure the consistency of thermal radiation flux, making the resulting images more suitable for scientific research related to thermal radiation properties.

Author Contributions

Conceptualization, C.S.; methodology, C.L. and C.S.; writing—original draft preparation, C.L.; All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Key Research and Development Program of China under Grant 2018YFB0505002.

Data Availability Statement

The original contributions presented in this study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Yu, D.; Sun, Z.; Meng, L.; Shi, D. Development History and Future Prospects of Mars Exploration. J. Deep Space Explor. 2016, 3, 108–113. (In Chinese) [Google Scholar] [CrossRef]
  2. Gou, S.; Yue, Z.; Di, K.; Zhang, X. Progress in the Detection of Hydrous Minerals on the Martian Surface. J. Remote Sens. 2017, 21, 531–548. (In Chinese) [Google Scholar] [CrossRef]
  3. Tao, Y.; Conway, S.J.; Muller, J.-P.; Putri, A.R.D.; Thomas, N.; Cremonese, G. Single Image Super-Resolution Restoration of TGO CaSSIS Colour Images: Demonstration with Perseverance Rover Landing Site and Mars Science Targets. Remote Sens. 2021, 13, 1777. [Google Scholar] [CrossRef]
  4. Li, C.; Zhang, R.; Yu, D.; Dong, G.; Liu, J.; Geng, Y.; Sun, Z.; Yan, W.; Ren, X.; Su, Y.; et al. China’s Mars Exploration Mission and Science Investigation. Space Sci. Rev. 2021, 217, 57–81. [Google Scholar] [CrossRef]
  5. Malin, M.C.; Bell, J.F.; Cantor, B.A.; Caplinger, M.A.; Calvin, W.M.; Clancy, R.T.; Edgett, K.S.; Edwards, L.; Haberle, R.M.; James, P.B.; et al. Context Camera Investigation on Board the Mars Reconnaissance Orbiter. J. Geophys. Res. 2007, 112, 1–25. [Google Scholar] [CrossRef]
  6. Wilhelm, T.; Geis, M.; Püttschneider, J.; Sievernich, T.; Weber, T.; Wohlfarth, K.; Wöhler, C. DoMars16k: A Diverse Dataset for Weakly Supervised Geomorphologic Analysis on Mars. Remote Sens. 2020, 12, 3981. [Google Scholar] [CrossRef]
  7. Christensen, P.R.; Jakosky, B.M.; Kieffer, H.H.; Malin, M.C.; McSween, H.Y., Jr.; Nealson, K.; Mehall, G.L.; Silverman, S.H.; Ferry, S.; Caplinger, M.; et al. The Thermal Emission Imaging System (Themis) for the Mars 2001 Odyssey Mission. In Mars Odyssey Mission; Springer: Dordrecht, The Netherlands, 2004; pp. 85–130. [Google Scholar] [CrossRef]
  8. Jakosky, B.M.; Lin, R.P.; Grebowsky, J.M.; Luhmann, J.G.; Beutelschies, G.; Priser, T.; Acuna, M.; Andersson, L.; Baird, D.; Baker, D.; et al. The Mars Atmosphere and Volatile Evolution (MAVEN) Mission. Space Sci. Rev. 2015, 195, 3–48. [Google Scholar] [CrossRef]
  9. Edwards, C.S.; Nowicki, K.J.; Christensen, P.R.; Hill, J.; Gorelick, N.; Murray, K. Mosaicking of Global Planetary Image Datasets: 1. Techniques and Data Processing for Thermal Emission Imaging System (THEMIS) Multi-Spectral Data. J. Geophys. Res. 2011, 116, 3755–3772. [Google Scholar] [CrossRef]
  10. Christensen, P.R.; Bandfield, J.L.; Hamilton, V.E.; Ruff, S.W.; Kieffer, H.H.; Titus, T.N.; Malin, M.C.; Morris, R.V.; Lane, M.D.; Clark, R.L.; et al. Mars Global Surveyor Thermal Emission Spectrometer Experiment: Investigation Description and Surface Science Results. J. Geophys. Res. Atmos. 2001, 106, 1370. [Google Scholar] [CrossRef]
  11. Christensen, P.R.; Bandfield, J.L.; Bell, J.F., III; Gorelick, N.; Hamilton, V.E.; Ivanov, A.; Jakosky, B.M.; Kieffer, H.H.; Lane, M.D.; Malin, M.C.; et al. Morphology and composition of the surface of Mars: Mars Odyssey THEMIS results. Science 2003, 300, 2056–2061. [Google Scholar] [CrossRef]
  12. Christensen, P.R. Formation of recent martian gullies through melting of extensive water-rich snow deposits. Nature 2003, 422, 45–48. [Google Scholar] [CrossRef] [PubMed]
  13. Sharma, R.; Srivastava, N. Detection and Classification of Potential Caves on the Flank of Elysium Mons, Mars. Res. Astron. Astrophys. 2022, 22, 10–21. [Google Scholar] [CrossRef]
  14. Hughes, C.G.; Ramsey, M.S. Super-resolution of THEMIS thermal infrared data: Compositional relationships of surface units below the 100 meter scale on Mars. Icarus 2010, 208, 704–720. [Google Scholar] [CrossRef]
  15. Wang, Q.; Ma, W.; Liu, S.; Tong, X.; Atkinson, P.M. Data Fidelity-Oriented Spatial-Spectral Fusion of CRISM and CTX Images. ISPRS J. Photogramm. Remote Sens. 2025, 220, 172–191. [Google Scholar] [CrossRef]
  16. Cheng, K.; Rong, L.; Jiang, S.; Zhan, Y. Overview of Remote Sensing Image Super-Resolution Reconstruction Technology Based on Deep Learning. J. Zhengzhou Univ. (Eng. Ed.) 2022, 43, 8–16. (In Chinese) [Google Scholar] [CrossRef]
  17. Chen, H.; He, X.; Qing, L.; Wu, Y.; Ren, C.; Sheriff, R.E.; Zhu, C. Real-World Single Image Super-Resolution: A Brief Review. Inf. Fusion 2022, 79, 124–145. [Google Scholar] [CrossRef]
  18. Liu, H.; Qian, Y.; Zhong, X.; Chen, L.; Yang, G. Research on super-resolution reconstruction of remote sensing images: A comprehensive review. Opt. Eng. 2021, 60, 100901. [Google Scholar] [CrossRef]
  19. Tao, H.; Tang, X.; Liu, J.; Tian, J.; Ungar, S.G.; Mao, S.; Yasuoka, Y. Superresolution Remote Sensing Image Processing Algorithm Based on Wavelet Transform and Interpolation. In Proceedings of SPIE, Hangzhou, China, 24–26 October 2002. [Google Scholar] [CrossRef]
  20. Kim, S.P.; Su, W.-Y. Recursive High-Resolution Reconstruction of Blurred Multiframe Images. IEEE Trans. Image Process. 1994, 2, 534–539. [Google Scholar] [CrossRef]
  21. Wang, W. Research on Infrared Super-Resolution Algorithms Based on Generative Adversarial Networks. Ph.D. Dissertation, University of Electronic Science and Technology of China, Chengdu, China, 2021. (In Chinese). [Google Scholar]
  22. Keys, R. Cubic Convolution Interpolation for Digital Image Processing. IEEE Trans. Acoust. Speech Signal Process. 1981, 29, 1153–1160. [Google Scholar] [CrossRef]
  23. Hou, H.; Andrews, H. Cubic Splines for Image Interpolation and Digital Filtering. IEEE Trans. Acoust. Speech Signal Process. 1978, 26, 508–517. [Google Scholar] [CrossRef]
  24. Zhang, X.; Wu, X. Image Interpolation by Adaptive 2-D Autoregressive Modeling and Soft-Decision Estimation. IEEE Trans. Image Process. 2008, 17, 887–896. [Google Scholar] [CrossRef] [PubMed]
  25. Teoh, K.K.; Ibrahim, H.; Bejo, S.K. Investigation on Several Basic Interpolation Methods for the Use in Remote Sensing Application. In Proceedings of the Conference on Innovative Technologies in Intelligent Systems and Industrial Applications, Cyberjaya, Malaysia, 12–13 July 2008; IEEE: Piscataway, NJ, USA, 2008; pp. 60–65. [Google Scholar] [CrossRef]
  26. Stark, H.; Oskoui, P. High-Resolution Image Recovery from Image-Plane Arrays, Using Convex Projections. J. Opt. Soc. Am. A 1989, 6, 1715–1726. [Google Scholar] [CrossRef]
  27. Yang, X. Research on Frequency Domain and Spatial Domain Super-Resolution Reconstruction Technology of Remote Sensing Images. Ph.D. Dissertation, Harbin Institute of Technology, Harbin, China, 2024. (In Chinese). [Google Scholar]
  28. Patti, A.J.; Sezan, M.I.; Tekalp, A.M. Robust Methods for High-Quality Stills from Interlaced Video in the Presence of Dominant Motion. IEEE Trans. Circuits Syst. Video Technol. 1997, 7, 328–342. [Google Scholar] [CrossRef]
  29. Shang, L.; Liu, S.-F.; Sun, Z.-L. Image Super-Resolution Reconstruction Based on Sparse Representation and POCS Method. In Proceedings of the International Conference on Intelligent Computing, Fuzhou, China, 20–23 August 2015; Springer: Cham, Switzerland, 2015; pp. 348–356. [Google Scholar] [CrossRef]
  30. Dong, C.; Loy, C.C.; He, K.; Tang, X. Learning a Deep Convolutional Network for Image Super-Resolution. In Proceedings of the European Conference on Computer Vision (ECCV), Zurich, Switzerland, 6–12 September 2014; Springer: Cham, Switzerland, 2014; pp. 184–199. [Google Scholar] [CrossRef]
  31. Wang, Z.; Chen, J.; Hoi, S.C.H. Deep Learning for Image Super-Resolution: A Survey. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 43, 3365–3387. [Google Scholar] [CrossRef] [PubMed]
  32. Anwar, S.; Khan, S.; Barnes, N. A Deep Journey into Super-Resolution: A Survey. ACM Comput. Surv. 2020, 53, 1–34. [Google Scholar] [CrossRef]
  33. Kim, J.; Lee, J.K.; Lee, K.M. Accurate Image Super-Resolution Using Very Deep Convolutional Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 1646–1654. [Google Scholar]
  34. Lim, B.; Son, S.; Kim, H.; Nah, S.; Lee, K.M. Enhanced Deep Residual Networks for Single Image Super-Resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Honolulu, HI, USA, 21–26 July 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 1132–1140. [Google Scholar] [CrossRef]
  35. Ahn, N.; Kang, B.; Sohn, K.-A. Fast, Accurate, and Lightweight Super-Resolution with Cascading Residual Network. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; Springer: Cham, Switzerland, 2018. [Google Scholar]
  36. Tong, T.; Li, G.; Liu, X.; Gao, Q. Image Super-Resolution Using Dense Skip Connections. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; IEEE: Piscataway, NJ, USA, 2017. [Google Scholar]
  37. Kim, J.; Lee, J.K.; Lee, K.M. Deeply Recursive Convolutional Network for Image Super-Resolution. In Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 4809–4817. [Google Scholar] [CrossRef]
  38. Zhang, Y.; Li, K.; Li, K.; Wang, L.; Zhong, B.; Fu, Y. Image Super-Resolution Using Very Deep Residual Channel Attention Networks. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; Springer: Cham, Switzerland, 2018; pp. 294–310. [Google Scholar]
  39. Ledig, C.; Theis, L.; Huszár, F.; Caballero, J.; Cunningham, A.; Acosta, A.; Aitken, A.P.; Tejani, A.; Totz, J.; Wang, Z.; et al. Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network. In Proceedings of the Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Honolulu, HI, USA, 21–26 July 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 105–114. [Google Scholar] [CrossRef]
  40. Wang, X.; Yu, K.; Wu, S.; Gu, J.; Liu, Y.; Dong, C.; Qiao, Y.; Change Loy, C. ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; Springer: Berlin/Heidelberg, Germany, 2018; pp. 63–79. [Google Scholar] [CrossRef]
  41. Li, J.; Zi, S.; Song, R.; Li, Y.; Hu, Y.; Du, Q. A Stepwise Domain Adaptive Segmentation Network with Covariate Shift Alleviation for Remote Sensing Imagery. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5618515. [Google Scholar] [CrossRef]
  42. Yuan, Y.; Liu, S.; Zhang, J.; Zhang, Y.; Dong, C.; Lin, L. Unsupervised Image Super-Resolution Using Cycle-in-Cycle Generative Adversarial Networks. In Proceedings of the Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Salt Lake City, UT, USA, 18–22 June 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 81401–81409. [Google Scholar] [CrossRef]
  43. McGill, G.E. Buried Topography of Utopia, Mars: Persistence of a Giant Impact Depression. J. Geophys. Res. 1989, 94, 2753–2759. [Google Scholar] [CrossRef]
  44. Séjourné, A.; Costard, F.; Gargani, J.; Soare, R.J.; Marmo, C. Evidence of an Eolian Ice-Rich and Stratified Permafrost in Utopia Planitia, Mars. Planet. Space Sci. 2012, 60, 248–254. [Google Scholar] [CrossRef]
  45. Wu, B.; Dong, J.; Wang, Y.; Li, Z.; Chen, Z.; Liu, W.C.; Zhu, J.; Chen, L.; Li, Y.; Rao, W. Characterization of the Candidate Landing Region for Tianwen-1—China’s First Mission to Mars. Earth Space Sci. 2021, 8, e2021EA001670. [Google Scholar] [CrossRef]
  46. Isola, P.; Zhu, J.-Y.; Zhou, T.; Efros, A.A. Image-to-Image Translation with Conditional Adversarial Networks. In Proceedings of the Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Honolulu, HI, USA, 21–26 July 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 1125–1134. [Google Scholar] [CrossRef]
  47. Zhu, J.Y.; Park, T.; Isola, P.; Efros, A.A. Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks. In Proceedings of the International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 2223–2232. [Google Scholar] [CrossRef]
  48. Saharia, C.; Ho, J.; Chan, W.; Salimans, T.; Fleet, D.J.; Norouzi, M. Image Super-Resolution via Iterative Refinement. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 4713–4726. [Google Scholar] [CrossRef]
  49. Zhang, Z.; Wang, Z.; Lin, Z.; Qi, H. Image super-resolution by neural texture transfer. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 7982–7991. [Google Scholar] [CrossRef]
Figure 1. Schematic diagram of the proposed super-resolution reconstruction method. (a) Super-resolution reconstruction with visible light texture reference. (b) Thermal radiation flux consistency constraint.
Figure 2. Schematic diagram of the upsampling module.
Figure 3. Structure of the generators.
Figure 4. Structure of the discriminators.
Figure 5. Schematic diagram of the thermal radiation flux consistency constraint network.
Figure 6. Calculation process of the intensity loss function.
Figure 7. Locations of the study area and supplementary sample selection areas.
Figure 8. Low-resolution thermal infrared imagery of the study area.
Figure 9. High-resolution visible light imagery of the study area.
Figure 10. Example images from the training/validation dataset.
Figure 11. Comparison of output results (crater).
Figure 12. Comparison of output results (rough surface).
Figure 13. Comparison of output results (smooth surface).
Figure 14. Comparison of output results (ridge).
Figure 15. Comparison of images before and after the constraint (crater).
Figure 16. Comparison of images before and after the constraint (rough surface).
Figure 17. Comparison of images before and after the constraint (smooth surface).
Figure 18. Comparison of images before and after the constraint (ridge).
Figure 19. Super-resolution reconstruction result of the entire study area.
Figure 20. Enlarged view of the rough surface in area A.
Figure 21. Enlarged view of the smooth surface in area B.
Figure 22. Enlarged view of the crater in area C.
Figure 23. Enlarged view of the ridge in area D.
Table 1. Visible light and thermal infrared image data source parameters.

| Mission | Launch Time | Mission Goal | Sensor | Spatial Resolution |
|---|---|---|---|---|
| 2001 Mars Odyssey | April 2001 | Map the global distribution of minerals on Mars | Thermal Emission Imaging System (THEMIS) | VIS: 18 m; IR: 100 m |
| Mars Reconnaissance Orbiter | August 2005 | Acquire high-resolution images of the Martian surface | Context Camera (CTX) | VIS: 5–6 m |
Table 2. Detailed training/validation dataset parameters.

| Topographic Unit | Image Band | Image Size | Image Resolution | Channel | Image Number |
|---|---|---|---|---|---|
| Crater | VIS | 256 × 256 | 25 m | 1 | 4881 |
| Crater | IR | 64 × 64 | 100 m | 1 | 4881 |
| Smooth Surface | VIS | 256 × 256 | 25 m | 1 | 5280 |
| Smooth Surface | IR | 64 × 64 | 100 m | 1 | 5280 |
| Rough Surface | VIS | 256 × 256 | 25 m | 1 | 6942 |
| Rough Surface | IR | 64 × 64 | 100 m | 1 | 6942 |
| Ridge | VIS | 256 × 256 | 25 m | 1 | 8003 |
| Ridge | IR | 64 × 64 | 100 m | 1 | 8003 |
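Each training sample in Table 2 pairs a 256 × 256 visible light patch (25 m) with the 64 × 64 thermal infrared patch (100 m) covering the same ground footprint, i.e., a 4:1 linear scale ratio. The following is a minimal sketch of how such pairs could be loaded; the directory layout, file format, and class name are illustrative assumptions rather than the dataset's actual structure.

```python
import glob

import numpy as np
from torch.utils.data import Dataset


class MarsVisIrPatches(Dataset):
    """Pairs 256x256 VIS patches (25 m) with 64x64 IR patches (100 m).

    Assumes matching file names under <root>/vis and <root>/ir, stored as
    .npy arrays (hypothetical layout, for illustration only).
    """

    def __init__(self, root):
        self.vis_files = sorted(glob.glob(f"{root}/vis/*.npy"))
        self.ir_files = sorted(glob.glob(f"{root}/ir/*.npy"))
        assert len(self.vis_files) == len(self.ir_files)

    def __len__(self):
        return len(self.vis_files)

    def __getitem__(self, idx):
        vis = np.load(self.vis_files[idx]).astype(np.float32)  # (256, 256)
        ir = np.load(self.ir_files[idx]).astype(np.float32)    # (64, 64)
        return vis[None, ...], ir[None, ...]  # add a channel dimension
```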
Table 3. Parameter counts of the comparison methods.

| Category | Network | Parameter Count |
|---|---|---|
| Spatial interpolation | Bicubic | – |
| CNN-based SISR | SRCNN | 57,281 |
| CNN-based SISR | SRRESNET | 1,536,384 |
| GAN-based SISR | ESRGAN | 16,697,987 (generator only) |
| GAN-based SISR | Pix2Pix | 54,407,809 (generator only) |
| GAN-based SISR | CycleGAN | 11,365,633 (generator only) |
| Diffusion model-based SISR | SR3 | 27,436,547 (generator only) |
| RefSR | SRNTT | 5,746,246 |
| Ours | Ours | 11,365,633 (step 1) + 49,868,971 (step 2) |
Table 4. Detailed evaluation criteria.

| Criterion | Equation | Parameters | Description |
|---|---|---|---|
| Image quality: Entropy (EN) | $EN = -\sum_{n=0}^{N-1} p_n \log_2 p_n$ | $N$ is the number of gray levels of the image; $p_n$ is the probability of gray level $n$ | Measures the complexity of the information in the image |
| Image quality: Spatial Frequency (SF) | $SF = \sqrt{RF^2 + CF^2}$ | $RF = \sqrt{\frac{1}{H \times W}\sum_{i=1}^{H-1}\sum_{j=1}^{W-1}\left[X_{SR}(i,j) - X_{SR}(i,j+1)\right]^2}$ and $CF = \sqrt{\frac{1}{H \times W}\sum_{i=1}^{H-1}\sum_{j=1}^{W-1}\left[X_{SR}(i,j) - X_{SR}(i+1,j)\right]^2}$ are the row and column change rates; $H$/$W$ are the height and width of the resulting image; $X_{SR}(i,j)$ is the value of pixel $(i,j)$ | Measures the sharpness of the texture |
| Correlation with the original image: Peak Signal-to-Noise Ratio (PSNR) | $PSNR = 10\log_{10}\frac{L^2}{MSE}$ | $MSE = \frac{1}{N}\sum_{i=1}^{N}\left[X(i) - X_{SR}(i)\right]^2$; $X$ is the target image; $X_{SR}$ is the resulting image; $L$ is the maximum pixel value of the image | Calculates the ratio between the energy of the peak signal and the energy of the noise |
| Correlation with the original image: Structural Similarity Index Measure (SSIM) | $SSIM(X, X_{SR}) = \frac{(2\mu_X \mu_{X_{SR}} + C_1)(2\sigma_{X X_{SR}} + C_2)}{(\mu_X^2 + \mu_{X_{SR}}^2 + C_1)(\sigma_X^2 + \sigma_{X_{SR}}^2 + C_2)}$ | $\mu$/$\sigma$ are the mean and standard deviation of each image; $\sigma_{X X_{SR}}$ is the covariance between $X$ and $X_{SR}$; $C_1$/$C_2$ are constants for numerical stability | Measures the similarity between two images: the mean represents brightness, the standard deviation represents contrast, and the correlation term represents structural similarity |
| Image fusion quality: Index of Wang and Bovik ($Q_{ab/f}$) | $Q_{ab/f} = \frac{1}{\lvert W\rvert}\sum_{\omega \in W}\left[\lambda_\omega Q_0(a, f \mid \omega) + (1 - \lambda_\omega) Q_0(b, f \mid \omega)\right]$ | $\lambda_\omega = \frac{s(a \mid \omega)}{s(a \mid \omega) + s(b \mid \omega)}$; $a$/$b$ are the two source images and $f$ is the fused (resulting) image; $\omega$ is a sliding window; $s(a \mid \omega)$ is the saliency of image $a$ in window $\omega$; $Q_0(a, b \mid \omega)$ is the local quality index between $a$ and $b$ in window $\omega$, and $Q_0(a, b) = \frac{1}{\lvert W\rvert}\sum_{\omega \in W} Q_0(a, b \mid \omega)$ | Measures the relative amount of edge information transferred from the source images to the resulting image; evaluates fusion quality over sliding windows |
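For reference, the scalar metrics in Table 4 can be computed directly from the image arrays. The sketch below is a minimal NumPy implementation of EN, SF, PSNR, and a single-window (global) SSIM; the $C_1$/$C_2$ constants follow the common $(0.01L)^2$ and $(0.03L)^2$ convention, which is an assumption, and the windowed $Q_{ab/f}$ index is omitted for brevity.

```python
import numpy as np


def entropy(img, levels=256):
    """EN: Shannon entropy of the gray-level histogram."""
    hist, _ = np.histogram(img, bins=levels, range=(0, levels))
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))


def spatial_frequency(img):
    """SF: sqrt(RF^2 + CF^2), following the normalization in Table 4."""
    x = img.astype(np.float64)
    h, w = x.shape
    rf = np.sqrt(np.sum((x[:, :-1] - x[:, 1:]) ** 2) / (h * w))
    cf = np.sqrt(np.sum((x[:-1, :] - x[1:, :]) ** 2) / (h * w))
    return np.sqrt(rf ** 2 + cf ** 2)


def psnr(target, result, peak=255.0):
    """PSNR in dB between the target X and the reconstruction X_SR."""
    mse = np.mean((target.astype(np.float64) - result.astype(np.float64)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)


def ssim_global(x, y, peak=255.0):
    """Single-window SSIM; C1/C2 use the conventional (0.01L)^2, (0.03L)^2."""
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    c1, c2 = (0.01 * peak) ** 2, (0.03 * peak) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```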
Table 5. Evaluation metrics of the super-resolution reconstruction results (bold represents the maximum value).

| Method | EN_dist− (Output − vis) | SF_dist− (Output − vis) | SSIM+ | PSNR+ | $Q_{ab/f}$+ |
|---|---|---|---|---|---|
| Bicubic | −0.0628 | −8.6128 | 0.4282 | 16.7767 | 0.1526 |
| SRCNN | 0.5942 | −5.8568 | 0.3488 | 15.7262 | 0.1584 |
| SRRESNET | 0.9095 | −3.5941 | 0.3150 | 15.3814 | 0.1708 |
| ESRGAN | 0.6023 | 2.5644 | 0.3504 | 15.3241 | 0.1679 |
| SR3 | 0.4175 | −3.5102 | 0.3347 | 15.2946 | 0.1559 |
| Pix2Pix | 0.1269 | −2.8519 | 0.3475 | 17.0112 | 0.1625 |
| CycleGAN | 0.3247 | −2.8792 | 0.2724 | 16.7694 | 0.1775 |
| SRNTT | 0.8408 | −5.5545 | 0.5935 | 16.6444 | 0.1722 |
| Ours | 0.0421 | −0.4089 | 0.3344 | 17.9100 | 0.1778 |
Table 6. Evaluation metrics before and after the thermal radiation flux consistency constraint (bold represents the maximum value).

| | MSE− | EN+ | SF+ | SSIM+ | PSNR+ | $Q_{ab/f}$+ |
|---|---|---|---|---|---|---|
| Before Constraint | 0.0090 | 5.2781 | 9.7518 | 0.4547 | 18.7514 | 0.1868 |
| After Constraint | 0.0010 | 6.1784 | 12.7308 | 0.5444 | 20.6517 | 0.2021 |
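The MSE column in Table 6 quantifies how well the reconstruction preserves the thermal radiation flux of the original low-resolution image. As a rough sketch, such a check can be implemented by aggregating the 4× result back to the 100 m grid and comparing it with the input thermal infrared image; the block-averaging operator and function name below are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np


def flux_consistency_mse(sr_img, lr_img, scale=4):
    """Block-average the super-resolved image back to the LR grid and
    compare it with the original thermal infrared image (assumed operator)."""
    h, w = lr_img.shape
    sr = sr_img[:h * scale, :w * scale].astype(np.float64)
    # The mean radiance of each (scale x scale) block should match the LR pixel.
    sr_down = sr.reshape(h, scale, w, scale).mean(axis=(1, 3))
    return np.mean((sr_down - lr_img.astype(np.float64)) ** 2)
```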