1. Introduction
The thermal infrared (TIR) spectrum provides valuable information about the radiation emitted by objects in the wavelength range between 7 and 14 µm. At these wavelengths, the intensity of the emitted radiation depends on the temperature of the objects and on their infrared emissivity. For this reason, data acquired from thermal infrared sensors provide information about the temperature of objects, rather than about their reflectance. In addition, since temperature is an intrinsic property of objects, thermal infrared imagery reveals information about the internal state of objects that influences their surface. Therefore, thermal infrared imagery provides details about the subsurface state of bodies, in contrast with the rest of the infrared spectrum, which only reflects surface conditions.
These unique properties of TIR have made TIR imagery an increasingly used technique for applications where the temperature of objects is crucial for detecting faults, diagnosing their state and provenance, and assessing their severity. Examples of these applications include evaluating building energy efficiency [
1], preserving heritage elements [
2], diagnosing medical conditions [
3] and sport injuries [
4] in both humans and animals [
5], determining the dryness of crops [
6], and monitoring vegetation for fire prevention [
7].
Many of these applications involve the study of small areas that can be monitored using ground-based cameras operated by humans. However, other applications require mobile acquisitions and aerial perspectives. Consequently, thermal infrared sensors have been integrated into satellite platforms since the launch of Landsat 3 in 1978. Thermal infrared satellite imagery has enabled several studies including crop water stress [
8] and precision agriculture [
9], as well as energy-related research such as determining urban heat islands [
10] and detecting geothermal activity [
11]. Additionally, the use of TIR from satellite platforms has improved the response and evaluation of emergencies such as wildfires [
12] and volcanic anomalies [
13].
However, current TIR satellite imagery faces a significant limitation regarding its spatial resolution.
Table 1 presents the TIR sensors currently integrated into satellite platforms with freely downloadable data, along with their technical characteristics, including spatial and temporal resolutions. While the temporal resolution offers several alternatives that are useful for most applications, the spatial resolution remains limited even at its highest capability.
According to [
14], reliable temperature measurements require that the object be covered by a minimum of three pixels in the TIR image. In their study on river temperatures, only large rivers with a minimum width of approximately 90 m could be analyzed, as the maximum spatial resolution in current satellite platforms is 30 m. In cases where the object of interest has smaller dimensions or appears at the edge of a pixel, sub-pixel heterogeneity can occur. This means that the pixels in the remotely sensed image consist of multiple surface types, rather than a single homogeneous one.
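The three-pixel rule translates directly into a minimum object size for a given sensor. The following is a minimal sketch of that arithmetic; the function name is illustrative and does not come from the cited study.

```python
def min_resolvable_width(pixel_size_m, min_pixels=3):
    """Smallest object width (in metres) that spans `min_pixels` whole pixels."""
    return pixel_size_m * min_pixels


# Pixel sizes mentioned in this paper: 30 m (best case) and 90 m (ASTER TIR).
for size in (30.0, 90.0):
    width = min_resolvable_width(size)
    print(f"{size:.0f} m pixels: object should be at least {width:.0f} m wide")
```

With 30 m pixels the rule gives the 90 m river width quoted above; with the native 90 m ASTER TIR pixels the same rule would require objects roughly 270 m wide.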
One of the solutions to the limited spatial resolution of TIR satellite imagery is the use of Unmanned Aerial Vehicles (UAVs), where the pixel size on the surface can be adjusted by modifying the flight altitude. However, UAVs have a set of limitations, including the need for a skilled operator, the investment in equipment, and a reduced scale for studies compared to satellite imagery.
In order to address this issue, several techniques and methodologies have been developed with the aim of synthetically enhancing the spatial resolution of TIR imagery, while maintaining the accuracy of the temperature measurement provided by the sensors.
Table 2 summarizes these techniques, outlining their requirements, advantages, and limitations. More information about techniques for increasing spatial resolution in satellite imagery can be found in [
15].
As shown in
Table 2, a commonly used technique such as pan-sharpening requires a reference image in the same band as the target image, but with the desired higher resolution. However, pan-sharpening is not applicable when improving the spatial resolution of one band using high-resolution data from another band, in a process known as band-blending. In such cases, Unmixing is the preferred option, as it has fewer wavelength limitations. ‘Unmixing’ involves reducing sub-pixel heterogeneity by creating sub-pixels.
Additionally, comparative studies, such as [
20], indicate that techniques yielding the most radiometrically accurate sharpening often do not produce synthetic images with the best visual quality. The latter is obtained by emphasizing the texture of the high-resolution image in the sharpened image, sometimes at the expense of its radiometric accuracy.
These characteristics make Unmixing the technique chosen in this paper for increasing the spatial resolution of satellite imagery in the thermal infrared. Since thermal infrared images have the lowest spatial resolution of all bands, the reference images used to increase it are acquired in other spectral bands, commonly by different sensors. For this reason, the Unmixing multisensor multiresolution technique is selected for this study, based on the developments of [
21]. The contribution of this paper is a refinement of the methodology, evaluating different application modes. The first contribution is the determination of the optimal classification method for assigning a class to each pixel in the high-resolution image, which affects the creation of sub-pixels for the reduction of sub-pixel heterogeneity. The second contribution lies in the methodology followed to calculate the sub-pixel values: according to [
20], the optimal moving window dimension is 5 × 5, which offers a good compromise between an acceptable scale of spatial averaging, the allowed number of classes, and the stability of the inversion. In this paper, a minimum moving window with the shape of a cross is evaluated, followed by moving windows with dimensions 3 × 3, 5 × 5, 7 × 7, and 9 × 9. Additionally, the application of closeness-to-the-center coefficients is analyzed. Thus, results show the optimal methodology design, focusing on its application to ASTER satellite imagery.
The paper is organized as follows: after this introductory
Section 1,
Section 2 describes the materials and methods used for the design of the optimal methodology for multisensor multiresolution Unmixing.
Section 3 presents the results obtained for the different parameters to evaluate, which are further discussed in
Section 4. Last,
Section 5 presents the conclusions reached with this study.
2. Materials and Methods
In this work, we analyzed the accuracy of different Unmixing methodologies applied to TIR images from the ASTER sensor (90 m spatial resolution), to enhance their resolution by a factor of three, achieving final 30 m synthetic ASTER TIR images. To accomplish this, RGB images from the OLI sensor on the Landsat 8-9 satellites are used as a reference for determining the pixel clusters that describe the sub-pixel heterogeneity in ASTER images. SDGSAT-1/TIS was also considered as a reference due to its 30 m spatial resolution, but the satellite pass time does not coincide with that of Terra, so the comparison of thermal data would not be exhaustive. Additionally, the validity of the 30 m Landsat 8-9 thermal data is considered proven by several studies, such as [
22,
23]: ref. [
22] confirms that single resampled ARD provides consistent results, with minimal differences from double resampled data. Additionally, a comprehensive evaluation of Landsat 8 recalibrations, as presented in [
23], confirms the accuracy of post-2019 recalibrations.
We evaluate different approaches in the Unmixing process, including two classification algorithms, variations in the number of clusters, four matrix sizes, and different matrix weights (
Figure 1).
The resulting synthetic ASTER TIR images are subjected to a validation process by comparing them with TIR images from the TIRS sensor on Landsat 8–9 (30 m spatial resolution) at the same date and location that serve as references. This approach was implemented in seven case studies: two municipalities in Spain (Segovia and Etxalar), two in France (Orange and Marseille), two in Italy (Genova and Ramacca), and one in Finland (Vakka-Suomi).
The data and methodologies analyzed are detailed below, along with the case studies used for validation.
2.1. Satellite Imagery
The source images intended for improvement are TIR images from ASTER, specifically the thermal images resulting from averaging bands B13 and B14, as these have spectral characteristics most similar to band 10 of the Landsat 8-9 TIRS sensor, which will be used for validation. The TIR images from both ASTER and TIRS are at processing level 1 with terrain correction, providing Top-Of-Atmosphere (TOA) radiance values in W m⁻² sr⁻¹ µm⁻¹ (specifically AST_L1T for ASTER and L1TP for TIRS downloaded from
https://earthexplorer.usgs.gov/, accessed on 20 June 2024). The characteristics of the data used by both sensors, as well as the high-resolution (HR) RGB images with TOA reflectance values from the OLI sensor on board Landsat 8–9 used in the Unmixing technique, are described in
Table 3.
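The band averaging that produces the ASTER source image can be sketched as follows. This is an illustrative example only: it assumes both bands have already been converted to TOA radiance on the same grid, and the sample values are hypothetical.

```python
def average_bands(b13, b14):
    """Pixel-wise mean of two co-registered radiance grids (lists of rows)."""
    same_shape = len(b13) == len(b14) and all(
        len(r13) == len(r14) for r13, r14 in zip(b13, b14)
    )
    if not same_shape:
        raise ValueError("bands must have the same shape")
    return [[(a + b) / 2.0 for a, b in zip(r13, r14)] for r13, r14 in zip(b13, b14)]


# Hypothetical 2 x 2 radiance patches in W m^-2 sr^-1 um^-1.
b13 = [[8.0, 9.0], [8.5, 9.5]]
b14 = [[10.0, 9.5], [8.5, 10.5]]
tir = average_bands(b13, b14)
print(tir)
```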
2.2. Unmixing
The Unmixing methodology applied in this work is based on the approach presented by [
20] and applied in [
21] for the generation of high-temporal and -spatial resolution TIR image data but focusing on the high-spatial resolution workflow. Thus, a linear Unmixing is applied, based on the spectral assumptions detailed in
Section 2.2.1. In this case, each sub-pixel in the high-resolution synthetic TIR image is defined as a pure component of a land use, which is determined according to the response of the class in the RGB bands. The approach is similar to that applied in [
24], but with a priori unknown classes.
2.2.1. Assumptions
The developed Unmixing algorithm relies on clusters derived from HR reflective images to compute each cluster contribution to low-resolution pixels (denoted as LR pixels). This process estimates the values of unmixed high-resolution pixels according to the fractional abundance of each cluster in the LR pixel. Given the disparity between wavelengths used (visible and thermal ranges), an assumption is made regarding the validity of the algorithm since linear statistical predictors provide good radiometric accuracy if the correlation between the high- and low-resolution bands is strong [
20]: variations in thermal properties correlate with variations in the reflectivity of the bodies.
When this correlation does not hold, areas with different thermal characteristics may be grouped into a single cluster in the reflective images, resulting in uniform values assigned to these areas. This effect diminishes thermal variability within the images, consequently affecting the accuracy of the subsequent models that rely on estimated temperature data. To mitigate this issue, this paper proposes the moving-window approach: instead of averaging the cluster values over the entire image, values are estimated within a sub-image defined by a moving window of dimensions N × N. This method uses the cluster contributions (fractional abundances) and radiance values of the LR pixels inside the window around each LR pixel to estimate values for the clusters within the moving window.
2.2.2. Unmixing Methodology
Classification of RGB HR Images
The objective of this step is to generate a classification map cl(f,g), where cl denotes the cluster of each pixel (f,g) in the HR image. The number of clusters varies with the size of the moving window to ensure redundancy in the resolution of the linear equation system. The moving windows tested in this study include cross (5 pixels), 3 × 3, 5 × 5, 7 × 7, and 9 × 9. Therefore, initial values of 5 and 9 clusters were used for the algorithms.
Two unsupervised classification algorithms, Iterative Self-Organizing Data Analysis Technique (ISODATA [
25]) and K-means [
26], are applied because, as unsupervised methods, they need no training samples and can initialize their cluster centers automatically. These algorithms perform well across various image types, whether rural, urban, desert, or forested areas. ISODATA dynamically adjusts the number of clusters based on statistical results of stability and homogeneity, while K-means produces the specified number of clusters in each iteration (
Figure 2).
The classification algorithms are applied to images in the RGB bands from Landsat 8-9, with a spatial resolution of 30 m. This choice is based on the assumption that variations in thermal properties correlate with variations in the reflectivity of the bodies, as supported by previous studies [
27,
28].
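The K-means step can be illustrated with a plain implementation of Lloyd's algorithm on RGB reflectance triplets. This is a minimal sketch, not the implementation used in the study: the evenly spaced initial centers, the fixed iteration count, and the sample pixels are assumptions, and ISODATA's splitting and merging of clusters is not shown.

```python
def kmeans(pixels, k, iterations=20):
    """Cluster (R, G, B) tuples into k groups; returns (labels, centers)."""
    if k < 1 or k > len(pixels):
        raise ValueError("k must be between 1 and the number of pixels")
    # Deterministic start (an assumption): k pixels evenly spaced in the list.
    step = len(pixels) // k
    centers = [pixels[i * step] for i in range(k)]
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((p - q) ** 2 for p, q in zip(px, centers[c])))
            for px in pixels
        ]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [px for px, lab in zip(pixels, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return labels, centers


# Hypothetical TOA reflectances: two dark, two intermediate, two bright pixels.
pixels = [
    (0.05, 0.06, 0.04), (0.06, 0.05, 0.05),
    (0.30, 0.28, 0.25), (0.31, 0.29, 0.27),
    (0.60, 0.62, 0.65), (0.61, 0.60, 0.66),
]
labels, centers = kmeans(pixels, k=3)
print(labels)
```

Applied to a full scene, the label list reshaped to the image dimensions plays the role of the classification map cl(f,g).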
Definition of Cluster Contributions
Our approach differs from existing methodologies [
20,
26] by not relying on the Point Spread Function (PSF) of the sensor. Instead, we calculate the cluster contribution based on the values of pixels within a moving window and their respective clusters [
21,
22]. The contribution $c_{i,k}$ of cluster $k$ to the signal of the LR pixel $i$ is defined by Equation (1):
$$c_{i,k} = \frac{n_{i,k}}{n_i} \tag{1}$$
where $n_{i,k}$ is the number of HR pixels belonging to cluster $k$ within the moving window, and $n_i$ is the total number of HR pixels within the window. This approach provides a more versatile and sensor-independent methodology.
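The counting behind the cluster contributions can be sketched as follows. This example assumes that the counts are taken over the footprint of each LR pixel (3 × 3 HR pixels for the 90 m to 30 m case) and that the HR grid is an exact multiple of that footprint; the function name and the sample cluster map are hypothetical.

```python
def cluster_contributions(cluster_map, n_clusters, scale=3):
    """Fraction of each cluster inside every LR pixel footprint.

    cluster_map : HR classification map as a list of rows of cluster indices
    scale       : number of HR pixels per LR pixel along each axis
    Returns a grid where result[r][c][k] is the share of cluster k in LR pixel (r, c).
    """
    rows, cols = len(cluster_map), len(cluster_map[0])
    if rows % scale or cols % scale:
        raise ValueError("HR grid must be a multiple of the LR footprint")
    total = scale * scale
    result = []
    for r0 in range(0, rows, scale):
        out_row = []
        for c0 in range(0, cols, scale):
            counts = [0] * n_clusters
            for r in range(r0, r0 + scale):
                for c in range(c0, c0 + scale):
                    counts[cluster_map[r][c]] += 1
            out_row.append([n / total for n in counts])
        result.append(out_row)
    return result


# Hypothetical 3 x 6 HR cluster map covering two LR pixels side by side.
cluster_map = [
    [0, 0, 1, 2, 2, 2],
    [0, 1, 1, 2, 2, 2],
    [0, 0, 1, 2, 2, 0],
]
fractions = cluster_contributions(cluster_map, n_clusters=3)
print(fractions)
```

In each LR pixel the fractions sum to one, which is what allows them to be read as fractional abundances.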
Window-Based Unmixing of LR Pixels
The relationship between the number of clusters and the size of the moving window is crucial: a window with more pixels than clusters increases coverage but reduces variability, while a window with as many pixels as clusters increases variability but may lead to less accurate estimates. Therefore, an optimal balance must be found.
In this work, different numbers of clusters have been analyzed to determine an optimal value based on the moving window size. For instance, 5 clusters are used for a cross-shaped moving window (5 pixels), and 9 clusters for larger windows. This approach allows for evaluating and establishing the most effective strategy for achieving accurate and reliable pixel Unmixing results.
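The balance between window size and number of clusters can be made concrete by counting equations (LR pixels in the window) against unknowns (cluster signals). The sketch below is illustrative; the function names are hypothetical.

```python
def window_offsets(kind):
    """Row/column offsets of the LR pixels in a window.

    kind is either the string "cross" or an odd window size N (3, 5, 7, 9).
    """
    if kind == "cross":
        return [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]
    half = kind // 2
    return [(dr, dc) for dr in range(-half, half + 1) for dc in range(-half, half + 1)]


def redundancy(kind, n_clusters):
    """Equations minus unknowns: 0 is exactly determined, above 0 is overdetermined."""
    return len(window_offsets(kind)) - n_clusters


for kind in ("cross", 3, 5, 7, 9):
    for n_clusters in (5, 9):
        print(kind, n_clusters, redundancy(kind, n_clusters))
```

A negative value marks a configuration with fewer LR pixels than clusters, for which the linear system cannot be solved uniquely; this is why the cross-shaped window is paired with five clusters rather than nine.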
Reconstruction of the Unmixed Image
The Unmixing process uses a moving-window step size of 1 pixel. Mean values for each cluster are estimated within the central pixel of the window. For each LR pixel within the moving window, HR pixels within its footprint are identified, and their respective cluster assignments are used to compute the cluster contribution. The signal of each cluster in the unmixed image is estimated by solving a system of linear equations (Equation (2)):
$$L_i = \sum_{k=1}^{K} c_{i,k}\,\bar{L}_k \tag{2}$$
where $L_i$ is the radiance value of pixel $i$ in the moving window (the 90 m ASTER pixel in our case), $c_{i,k}$ is the contribution of cluster $k$ to that pixel (Equation (1)), $K$ is the number of clusters, and $\bar{L}_k$ (the 30 m pixel) is the estimated signal of cluster $k$. This approach, termed constrained unmixing, neglects the error term to preserve signal variability within clusters. The explanation and detail of Equation (2) are given in the original paper on the unmixing method [
21].
For matrices 3 × 3, 5 × 5, 7 × 7, and 9 × 9, the distance of each pixel to the center is used as a weight, $w_i$, in the radiance estimation (Equation (3)):
The weights are assigned as follows: 1 for the central pixel and its immediate neighbors, 0.99 for pixels in the first distance radius, and 0.98 for pixels in the second distance radius.
Figure 3 shows the weight distribution for LR pixels in cross, 3 × 3, 5 × 5, 7 × 7, and 9 × 9 matrices.
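The window-based inversion can be sketched as a small least-squares problem: one equation per LR pixel in the window, one unknown signal per cluster. The example below is an assumption-laden illustration, not the authors' code. In particular, the weights are applied as a weighted least-squares criterion (each squared residual multiplied by its weight), which is one plausible reading of the weighting described above, and the weight layout and sample values are hypothetical.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix [a | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: the window does not constrain every cluster")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def unmix_window(contrib, radiance, weights=None):
    """Weighted least-squares estimate of one signal per cluster.

    contrib[i][k] : contribution of cluster k to LR pixel i of the window
    radiance[i]   : measured radiance of LR pixel i
    weights[i]    : optional weight of LR pixel i (all 1.0 when omitted)
    """
    n_pixels, n_clusters = len(contrib), len(contrib[0])
    if weights is None:
        weights = [1.0] * n_pixels
    # Normal equations (A^T W A) x = A^T W b of the weighted problem.
    ata = [[sum(w * row[p] * row[q] for row, w in zip(contrib, weights))
            for q in range(n_clusters)] for p in range(n_clusters)]
    atb = [sum(w * row[p] * y for row, w, y in zip(contrib, weights, radiance))
           for p in range(n_clusters)]
    return solve_linear(ata, atb)


# Hypothetical 3 x 3 window (row-major), three clusters, noise-free radiances.
true_signal = [8.0, 9.5, 11.0]
ninths = [[9, 0, 0], [6, 3, 0], [3, 3, 3], [0, 9, 0], [0, 6, 3],
          [3, 0, 6], [0, 0, 9], [5, 4, 0], [1, 0, 8]]
contrib = [[n / 9 for n in row] for row in ninths]
radiance = [sum(c * s for c, s in zip(row, true_signal)) for row in contrib]
corner_weights = [0.99, 1.0, 0.99, 1.0, 1.0, 1.0, 0.99, 1.0, 0.99]  # assumed layout

plain = unmix_window(contrib, radiance)
weighted = unmix_window(contrib, radiance, corner_weights)
print(plain, weighted)
```

Because the synthetic radiances are exact mixtures, both calls recover the same cluster signals; with real, noisy data the two estimates would differ slightly. The HR pixels inside the central LR pixel then receive the signal of their own cluster. In practice, a library routine such as `numpy.linalg.lstsq` would normally replace the hand-written solver.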
2.3. Case Studies
To assess the Unmixing approaches proposed for improving the spatial resolution of the ASTER TIR images, we selected seven municipalities as case studies: two in Spain (Segovia and Etxalar), two in France (Orange and Marseille), two in Italy (Genova and Ramacca), and one in Finland (Vakka-Suomi), as shown in
Figure 4. These locations were chosen not only due to the availability of both ASTER and OLI/TIRS data from the same dates, ensuring a robust basis for comparison and validation of our methodology, but also to cover a range of different climatic and topographic conditions. The study was conducted for the months of August and September 2022 to capture seasonal variations and ensure comprehensive analysis of the proposed techniques.
In order of appearance in
Figure 4: Genova (Genoa), Italy, covers 243 km² and has a Mediterranean climate at an average altitude of 20 m. Temperatures range from 10 °C to 19 °C, with 1100 mm of precipitation per year, and the area features a coastal city and hilly terrain. Vakka-Suomi in Finland covers 3205 km² and has a boreal climate with temperatures ranging from −10 °C to 15 °C and 700 mm of precipitation per year. The area is marked by dense pine and spruce forests, numerous lakes, and wetlands. Marseille, in France, covers 240 km² with a Mediterranean climate at an altitude of 45 m. Temperatures vary from 8 °C to 22 °C, with 500 mm of annual precipitation. The coastal city is known for its rugged coastline and the nearby Calanques National Park. Ramacca, in Sicily, spans 110 km² with a Mediterranean climate at 180 m altitude. It experiences temperatures from 10 °C to 24 °C and receives 600 mm of precipitation annually, and it is characterized by rural landscapes and agriculture. Segovia in Spain covers 296 km² with a continental Mediterranean climate, averaging 1005 m in altitude. Temperatures range from 5 °C to 15 °C annually, with 500 mm of precipitation per year. The region features pine and oak woodlands and is influenced by the Eresma and Clamores rivers. Etxalar, also in Spain, is a 22 km² area characterized by a temperate oceanic climate, with an average altitude of 122 m. Temperatures range between 8 °C and 18 °C, and it receives 1600 mm of precipitation annually. The lush beech, oak, and chestnut forests are complemented by the Bidasoa River. Lastly, Orange, France, covers 128 km² and experiences a Mediterranean climate at an altitude of 50 m. With temperatures ranging from 6 °C to 23 °C and 700 mm of precipitation per year, it features the Rhône River and diverse vegetation including vineyards and Mediterranean forests. This diverse range of climatic and topographic conditions across the seven locations enhances the reliability and applicability of our study’s results, providing insights applicable across various environmental settings.
3. Results
In this section, we analyze the results obtained from the study across seven selected case study areas—Segovia, Etxalar, Orange, Marseille, Genova, Ramacca, and Vakka-Suomi. The evaluation encompasses both analytical and statistical methods, as well as visual assessments, to comprehensively assess the effectiveness and reliability of the proposed unmixing technique in enhancing the spatial resolution of ASTER TIR images.
3.1. Statistical Assessment
The quality of the fits was assessed using the coefficient of determination between the TOA radiance values obtained from the synthetic ASTER TIR images and those from the TIRS Landsat 8-9 sensor, considering each classification method, matrix size, number of clusters, and the inclusion or not of the weight matrix. The findings are presented in
Figure 5,
Figure 6 and
Figure 7, where ISO 5 stands for ISODATA classification algorithm [
25] with five clusters, and ISO 5P denotes the ISODATA classification algorithm with five clusters and weight coefficients based on distance. Analogously, KER 5 identifies K-means classification [
26] with five clusters, and KER 5P the same classification with five clusters and weight coefficients. The same logic applies to the other labels: ISO 9, ISO 9P, KER 9, and KER 9P.
In all case studies, the cross-shaped matrix (“cruz”) consistently yielded poor results across all classification algorithms and configurations, with the highest R² observed being 1.023 in Ramacca.
Similarly, significant discrepancies were noted when employing the K-means algorithm with a 3 × 3 window and five clusters, regardless of matrix weighting; in fact, the matrix weighting yields worse results in all cases than the same number of clusters without weights. The best performance in this scenario was observed in Marseille, achieving an R² of 0.846 at its peak.
Conversely, the most favorable outcomes across all case studies were achieved using the K-means algorithm with five clusters, without matrix weighting, and utilizing a 5 × 5 pixel window. This approach resulted in a mean RMSE of 0.159 and MAE of 0.112, taking the seven case studies into account.
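The three scores used in this assessment can be computed as sketched below. This is an illustration under stated assumptions: the coefficient of determination is taken here as the squared Pearson correlation between reference and synthetic radiances (the R² of a simple linear fit), and the sample values are hypothetical.

```python
import math


def mae(reference, estimate):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(e - r) for r, e in zip(reference, estimate)) / len(reference)


def rmse(reference, estimate):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(sum((e - r) ** 2 for r, e in zip(reference, estimate)) / len(reference))


def r_squared(reference, estimate):
    """Squared Pearson correlation, i.e. R2 of a simple linear fit between the two."""
    n = len(reference)
    mean_r = sum(reference) / n
    mean_e = sum(estimate) / n
    cov = sum((r - mean_r) * (e - mean_e) for r, e in zip(reference, estimate))
    var_r = sum((r - mean_r) ** 2 for r in reference)
    var_e = sum((e - mean_e) ** 2 for e in estimate)
    return cov * cov / (var_r * var_e)


# Hypothetical TOA radiances: Landsat TIRS reference vs. synthetic ASTER values.
tirs = [8.0, 9.0, 10.0, 11.0]
synthetic = [8.1, 8.9, 10.2, 10.8]
print(mae(tirs, synthetic), rmse(tirs, synthetic), r_squared(tirs, synthetic))
```

Under this definition R² lies between 0 and 1, while RMSE and MAE carry the units of the radiance data.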
3.2. Visual Assessment
After analyzing the results of all configurations for the proposed MMT Unmixing approaches and identifying the most promising outcome, we present below the visual results obtained by applying the optimal configuration according to the statistical analysis (K-means, 5 × 5, five clusters, no weighted matrix) to generate synthetic 30 m ASTER TIR images. These synthetic images are visually compared with both the original ASTER TIR images (90 m resolution) and the 30 m TIRS sensor images from Landsat 8-9 for the same date (
Figure 8,
Figure 9 and
Figure 10). Both the thermal image in single-band gray and in pseudocolor have been displayed. The pseudocolor image allows for easier identification of temperature values, while the single-band gray image helps to appreciate the resolution and sharpness of the results. The case studies of Segovia, Etxalar, and Orange have been selected as representative for this analysis because they present average results in the statistical analysis.
Upon conducting a thorough visual analysis of both the single-band and pseudocolor images, it is evident that the Unmixing process has significantly enhanced the spatial resolution of the radiance data in the thermal bands. The pre-Unmixing images (low-resolution), while informative, displayed a level of blurriness that obscured finer details. Post-Unmixing (high-resolution), however, the images exhibit markedly sharper features and improved clarity. The superior sharpness of synthetic 30 m ASTER TIR images compared to Landsat B10 images is due to ASTER’s higher Modulation Transfer Function (MTF) and the resampling artifacts in Landsat images. ASTER’s higher MTF preserves fine details and contrast, while resampling Landsat 100 m data to 30 m introduces blurring and reduces sharpness. The pseudocolor images, in particular, facilitate a more intuitive understanding of temperature distributions, while the single-band gray images underscore the enhanced resolution and sharpness achieved through the process. This improvement not only aids in more precise temperature mapping but also enhances the overall interpretability of the thermal data, confirming the efficacy of the Unmixing approach in refining spatial resolution.
4. Discussion
In this section, we interpret and analyze the results in relation to the study’s objectives, comparing our findings with previous studies and discussing the implications and limitations of our work. Additionally, we propose ideas for future research related to this topic.
A slight difference was observed between the K-means and ISODATA methodologies, with K-means generally outperforming ISODATA. However, ISODATA surpassed K-means when a maximum of nine classes was specified, as ISODATA consistently generated eight clusters in all case studies, resulting in better performance. This indicates that a lower number of clusters, particularly between five and nine, improves the results. Future work should explore the impact of using fewer clusters.
The moving-window (mesh) size significantly influenced the results, with the 5 × 5 mesh yielding the best outcomes. This finding raises the question of whether a 3 × 3 mesh with three clusters might have performed even better. Further investigation into the optimal mesh size and cluster number is warranted.
Lastly, the use of weights did not improve the results: the configurations consistently performed better without them. This suggests that the weight matrix, as implemented in this study, may not be necessary for enhancing spatial resolution in thermal band Unmixing processes. Future research should investigate alternative weighting strategies or the contexts in which weights might prove beneficial.
Overall, our findings demonstrate that careful selection of Unmixing parameters, including the number of clusters and mesh size, can significantly enhance the spatial resolution of thermal band images. These insights contribute to the growing body of knowledge on image Unmixing and provide a foundation for future studies aimed at optimizing remote sensing techniques.
5. Conclusions
This study enhances the spatial resolution of ASTER TIR images using advanced Unmixing techniques, validated against Landsat 8-9 TIRS data. The primary dataset consists of ASTER TIR images from bands B13 and B14, processed to level 1 with terrain correction, alongside high-resolution RGB images from Landsat 8-9 OLI, to improve spatial resolution through the Unmixing Multisensor Multiresolution Technique (MMT). This methodology integrates data across different spectral bands and sensors, optimizing classification methods and moving window sizes (cross-shaped and from 3 × 3 to 9 × 9) to refine sub-pixel resolution.
Statistical analysis across multiple case studies reveals that the K-means algorithm with five clusters, a 5 × 5 window, and no matrix weighting yields the best results, achieving a peak R2 of 0.846 in Marseille and mean RMSE of 0.159 and MAE of 0.112, taking the seven case studies into account.
Visual analysis supports these results, showing improved image sharpness and clarity post-unmixing, with better temperature mapping and enhanced interpretability.
In summary, this study advances unmixing techniques for improving the spatial resolution of ASTER TIR images while preserving the radiometric accuracy of the TOA radiance data. This has practical implications for environmental monitoring, resource management, and disaster response.
The key contributions of the study are as follows:
Methodological Rigor: The paper delivers a substantial contribution by rigorously developing a methodology to enhance the spatial resolution of thermal infrared (TIR) imagery through unmixing. Unlike previous works, this study provides a comprehensive evaluation of various unmixing strategies, including different window sizes (3 × 3, 5 × 5, 7 × 7, 9 × 9) and classification methods (ISODATA and K-means), with or without weight coefficients.
Optimal Parameters Identified: The research identifies the most effective setup for high-resolution image reconstruction: a 5 × 5 moving window combined with the K-means algorithm without matrix weighting. This precise identification of optimal parameters offers actionable insights for improving TIR imagery applications.
Practical Impact: The enhanced spatial resolution has significant practical applications, from environmental monitoring to emergency response. The study addresses limitations in current satellite data, enabling better detection and analysis of small-scale features.
Contribution to Methodological Rigor: By systematically exploring various unmixing strategies, the paper advances methodological rigor, refining existing techniques and setting a new standard for future research in the field.
Future research could extend these methods to Landsat 8-9 TIRS images and explore integrating high-resolution RGB data from Sentinel-2 to further enhance thermal imaging precision. Further investigation into alternative classification methods and advanced computational algorithms will be essential for continued advancements in spatial resolution enhancement in remote sensing.