A Combination of TsHARP and Thin Plate Spline Interpolation for Spatial Sharpening of Thermal Imagery

There have been many studies and much attention paid to spatial sharpening for thermal imagery. Among them, TsHARP, based on the good correlation between vegetation index and land surface temperature (LST), is regarded as a standard technique because of its operational simplicity and effectiveness. However, as LST is affected by other factors (e.g., soil moisture) in the areas with low vegetation cover, these areas cannot be well sharpened by TsHARP. Thin plate spline (TPS) is another popular downscaling technique for surface data. It has been shown to be accurate and robust for different datasets; however, it has not yet been attempted in thermal sharpening. This paper proposes to combine the TsHARP and TPS methods to enhance the advantages of each. The spatially explicit errors of these two methods were firstly estimated in theory, and then the results of TPS and TsHARP were combined with the estimation of their errors. The experiments performed across various landscapes and data showed that the proposed combined method performs more robustly and accurately than TsHARP.


Introduction
Land surface temperature (LST) is an indispensable parameter for radiation budget estimation, drought monitoring, urban heat island assessment and other global change related studies [1][2][3][4].Due to the technical limitations of instrument design, the spatial resolution of the thermal infrared (TIR) band is unfortunately coarser than that of the visible-near infrared (VNIR) bands for most remote sensing platforms [5].As LST data are required at a relatively finer resolution for many applications, such as urban heat island analysis, precise agriculture monitoring, etc. [5,6], considerable efforts have been devoted to sharpen the thermal imagery using VNIR bands [7,8].Among them, the TsHARP method [9,10] is the most popular because of its operational simplicity and effectiveness [11].The effectiveness of TsHARP is based on the fact that vegetation cover, mainly represented by Normalized Difference Vegetation Index (NDVI), is the most important factor for determining the LST of many landscapes.Thus, this method assumes that the relationship between LST and NDVI is unique across a given imaging scene and independent at different scales.Therefore, the NDVI-LST relationship that has been established at thermal image can be applied to VNIR image, by which the LST image can be sharpened to the higher VNIR resolution.The effectiveness of TsHARP has been demonstrated in previous studies [12,13].
However, recent studies also reported some disadvantages of TsHARP.The scale effect of the NDVI-LST relationship cannot always be disregarded, which could lead to a sizable error for areas dominated by natural vegetation [14].Moreover, NDVI is not the unique parameter that determines LST.A triangular scatterplot in NDVI-LST feature space does commonly exist in many images [15,16].It implies that LST varies largely by other factors (e.g., soil moisture, albedo) in areas with low NDVI values.Although this issue can be addressed to some extent by residual correction, it is still a significant source of errors for TsHARP in heterogeneous areas [13].Therefore, studies have been recently conducted to address these issues.For example, other parameters derived from VNIR bands, such as non-photosynthetically vegetation cover, solar albedo, and Normalized Difference Wetness Index (NDWI), were introduced to estimate LST for thermal sharpening [17][18][19].In addition, more complex models, such as regression tree, artificial neural net, and physical models were applied in the sharpening procedure [5,20,21].Many of the improvements rely on specific designing for certain land cover composite or landscape, which limit their wide applications.A recent effort by Gao [5] shows that regression-tree based technique has a potential to be adaptable for various data and landscapes.However, machine learning techniques might suffer from intensive computation cost and over-fitting risk [22].Therefore, TsHARP remains the standard technique for comparison in the current researches [23,24].
In this study, we attempted to combine another popular downscaling technique, spatial interpolation, to improve the accuracy and robustness of TsHARP.Various spatial interpolation techniques are widely used to enhance the spatial resolution of raster imagery.Among them, thin plate spline (TPS) [25,26] is considered as an effective and accurate method for simple parameter setting, good smoothness and robustness.Accordingly, TPS has been widely employed in spatial interpolation of geo-data, including meteorological data, DEM, etc. [27,28].However, to our knowledge, there has been no attempt to use TPS in thermal sharpening.In this study, we combined the TPS and TsHAPR methods to enhance the advantages of each.The results of TPS and TsHARP are combined with the estimation of their errors in theory.In this manner, TPS result can be assigned with higher weights for the areas where TsHARP works poorly, and vice versa.Combining TPS and TsHARP is expected to improve both the accuracy and robustness of the thermal sharpening technique.The rest of paper is organized as follows.The details of the combined technique of TsHARP and TPS are introduced in Section 2; Section 3 describes a set of experiments for evaluating the proposed method across diverse landscapes; finally, Discussion and Conclusion are summarized in Sections 4 and 5.

Notations and Definitions
For convenience and clarity, some symbols are defined here: n: the number of coarse pixels in low-resolution thermal imagery; m: the number of fine pixels covering the same area as a coarse pixel; N: the number of fine pixels in high-resolution VNIR imagery, which are equal to n × m; (x, y): row and column index of a certain pixel; i: index of a coarse pixel, i = 1, …, n; j: index of fine pixels in each coarse pixel, j = 1, …, m; T high (x ij , y ij ): True LST at the fine pixel (x ij , y ij ); T low (x i , y i ): LST at the coarse pixel (x i , y i ); NDVI high (x ij , y ij ): NDVI at the fine pixel (x ij , y ij ); NDVI low (x i , y i ): Aggregated NDVI at the coarse pixel (x i , y i ).

TsHARP Method
TsHARP assumes that there is a unique NDVI-LST relationship exists within a remotely sensed scene at multiple spatial resolutions.Firstly, an empirical relationship between NDVI and LST at the coarse thermal resolution is established.
where a and b are the regression coefficients estimated via the least-squares method, and ε reg (x i , y i ) is the residual field at coarse pixel (x i , y i ).Only the simplest linear regression is considered because a suitable selection of LST proxy and regression tool is still problematic [11].The regression equation is then applied to the fine pixels, and the LST at high spatial resolution is estimated as: Next, the final estimate of LST (T TsHARP ) is derived by adding the coarse-resolution residual field back into the regression estimate: TsHARP has a high accuracy in vegetation dominated areas; however, it might produce significant errors in low NDVI areas, such as urban areas, where LST could vary greatly as a result of the change of soil moisture and other factors.

Thin Plate Spline (TPS) Interpolation
TPS is a spatial interpolation technique for point data, which is also widely used for downscaling various geo-data.The basic principle of TPS is based on the spatial dependence of geo-data.Accordingly, it firstly fit a spatial dependent function that interpolates the known point data and minimizes the bending energy [24].Then, data at any point on the surface can be interpolated by the fitted function.The basic TPS function is defined as: with the following constrains: where , and the coefficients are determined for satisfying: The TPS function can be fitted in a local spatial window or the entire image.In order to reduce the computation cost, TPS function in this study was fitted in the moving window of 5 × 5 coarse pixels.The TPS function was then used to acquire LST at high resolution without VNIR data input.
Due to the fact that TPS interpolation considers only the spatial dependence of LST, it produces a result with a smooth spatial pattern in accordance with that coarse data exhibit.Compared to TsHARP, TPS cannot reconstruct the rich spatial details of LST image; however, it can produce a better result in the areas where NDVI poorly indicates LST and LST displays somewhat spatial continuity.

Combination of TsHARP and TPS
In order to enhance the advantages both of TsHARP and TPS, we developed a method which combines TsHARP and TPS.First, two high-resolution LST images were acquired by the sharpening of the regression method (Equation (2), TsHARP without residual correction) and TPS, respectively.Secondly, the Error of Regression Method and TPS were estimated respectively for each coarse pixel.Third, the weighted results of the regression method and TPS were combined with the estimation of their errors.Finally, new residual fields were added back to the weighted results.The flowchart is shown in Figure 1 and detailed procedures are given below.

Error Estimation of the Regression Method
For the regression method, the error variance of regressed LST in coarse pixel (x i , y i ) can be calculated as the square of a coarse-resolution residual field if the error is assumed to be constant in a coarse pixel (x i , y i ) A larger coarse-resolution residual square in Equation ( 8) indicates a larger bias to the regression model for this coarse pixel, consequently it is a good estimate of error variance for the regression method.

Error Estimation of TPS
In general, TPS produces smoother LST imagery with a lack of spatial details because only spatial dependence is considered.Therefore, the spatial variance of downscaled LST by TPS is generally lower than that of true LST at same spatial scale.Assuming that total variance of true LST can be where: Equations ( 10) and (11) are the total variance and model variance of LST in coarse pixel (x i , y i ) respectively.The absolute operation in Equation ( 9) is to address the abnormal case that total variance is less than model variance.As the true high-resolution LST (T high (x ij , y ij )) is unknown, the total variance cannot be calculated directly.However, it can be estimated from the NDVI-LST regression model, as: where: x y x y x y m    (13) and Var(ε reg ) is the residual variance in the NDVI-LST regression model at low resolution.Combined with Equation ( 9), the error variance of TPS at each coarse pixel is: Equation ( 14) means that the variance difference of the LST derived from NDVI-LST model and TPS can reflect the error of TPS.Such error estimation is empirical, however reasonable.In general, spatial variance of the LST sharpened by TPS is relatively small.If the NDVI variance in a coarse pixel is large, TPS cannot produce a good result because NDVI is the main factor determining LST.In contrast, if the NDVI variance is small, TPS produces better result as spatial dependence plays a more important role than NDVI in determining LST distribution.Especially, in the areas with low NDVI values, NDVI-LST relationship works poorly and spatial dependence should be considered more.Therefore, the proposed error estimation can reflect error of TPS to some extent.

Combination of TPS and Regression Method
Considering that Equations ( 8) and ( 14) are the estimates of the sharpening error variance of TPS and the regression method, the weights of these two methods are calculated for each coarse pixel: The sum of the two weight values is equal to one.Based on Bayesian principle, the weighted summation of the regression method and TPS is the best estimate [29]: Finally, a new coarse-resolution residual field is added back to the result: 3. Experiments

Simulation Experiment Using ASTER Data
To investigate the effectiveness of the proposed combined method, we conducted a simulation experiment using subset images of ASTER level-1B (radiometric corrected VNIR bands) and level-2B03 (LST) products acquired of Yokohama city, Japan on 25 April 2004.Figure 2a shows the false color composite of this ASTER VNIR image, exhibiting very heterogeneous landscape.The low-resolution LST data was simulated by upscaling the LST image from the original 90-m resolution to 720-m resolution (Figure 2c).The upscaling was conducted by arithmetic averaging because other methods based on Stefan-Boltzmann law do not show obvious superiority [30].Meanwhile, the NDVI image was also linearly resampled from 15-m to 90-m (Figure 2b).Then a simulation experiment of sharpening LST from 720-m to 90-m was conducted, and the original 90-m ASTER LST image served as the true high-resolution LST data for validation.3a) with the sharpened images by the combined method, TsHARP, and TPS (Figure 3b-d).TsHARP reconstructed rich spatial details of the LST image (Figure 3c), while TPS produced a much smoother spatial pattern (Figure 3d).In general, the result of TsHARP was more similar to the true LST image than that of TPS.However, the regression residual increased, both the RMSEs of the two methods also increased, while the RMSE of the proposed method increases in a slower manner.This indicates that the proposed method improves the sharpening accuracy of TsHARP for the areas where the residual of regression model is larger.This result further confirms that our proposed weights for the combination of the regression method and TPS were reasonable.

Simulation Experiments across Different Landscapes
To evaluate the robustness of the proposed method, more simulation experiments were conducted using different datasets.In addition to the ASTER imagery used in the Section 3.1, additional three satellite images acquired in different landscapes were used for the experiments.The information of the used satellite images for the new experiments is shown in Table 1 and Figure 5.For the Landsat ETM+ data, LST was calculated from thermal infrared band using the Mono-window algorithm [31]     Figure 6 shows some examples of the sharpening results (360-m to 90-m for ASTER and 240-m to 60-m for ETM+) by TsHARP and the combined method.Through the images of the sharpening error, it is found that the combined method produced smaller error than TsHARP especially for the water area.The river traces in the error images were largely reduced by the combined method.Significant differences are also found in the quantitative accuracy assessment.As shown in Figure 7, the proposed combined method showed lower RMSE than TsHARP for various landscapes and resolution-combinations with the exception of cropland landscape.For the cropland, the accuracy of the combined method was sometimes slightly lower than TsHARP.As the TsHARP method is particularly suitable for cropland [12,13], it acquires much higher accuracy for cropland than for other landscapes.Therefore, the potential of improvement by the combined method is very limited for this case.In summary, these experiments across various landscapes and resolution-combinations showed that the proposed method performed more accurately and robustly than TsHARP.

Application of MODIS Data
In addition to the simulation study, an application experiment using MODIS data was also conducted to evaluate our method.250-m MODIS NDVI and 1000-m LST images acquired in Yokohama city on 25 April 2004, (Figure 8a,b) were used for thermal image sharpening.Two sharpened 250-m resolution LST images were acquired by the proposed method and TsHARP, respectively, shown in Figure 8c,d.Then, a 250-m resolution LST image (Figure 8g) linearly resampled from 90-m ASTER LST image was served as the true LST image.The validation was conducted on the reprojected images of the corresponding subset of sharpened MODIS LST images (Figure 8e,f).
As shown in the scatterplots of two sharpened LST versus true LST (Figure 8h,i), the proposed combined method produced more accurate results than TsHARP.The RMSE of the proposed method was 2.38 K, while the RMSE of the TsHARP method was 2.61 K.This experiment confirmed that the proposed method outperforms the TsHARP not only in simulation experiments but also in practical application.

Discussion
The effectiveness of TsHARP relies on the good correlation between NDVI (or vegetation cover) and LST.However, the NDVI-LST relationship does not always endure in all the situations.A triangular is often found in scatterplot of NDVI and LST.As shown in the schematic of Figure 9 (revised from Figure 2 in [15]), the residuals of NDVI-LST regression model increase when the NDVI decreases because the evaporation vary greatly based on the moisture levels in the areas with low NDVI.As an extreme example, water area with low NDVI exhibit low LST, corresponding to the wet edge in Figure 9.For this case, TsHARP could produce large sharpening errors.TPS interpolation is a widely used downscaling technique for geo-data.It requires no assumptions, and performs robustly under a wide variety of situations.When prior knowledge is unavailable (e.g., NDVI-LST relationship does not work), TPS can acquire a comprise result without large errors although the spatial details cannot be reconstructed accurately.Thus, TPS could be a good supplement of TsHARP in areas where NDVI poorly indicates LST.For this reason, we combined TPS and TsHARP based on their error estimation in theory.The method with the larger error estimation is assigned with smaller weight; therefore, TPS is assigned with high weights in the areas where TsHARP works poorly for avoiding large sharpening error.It shows that the combined method outperforms TsHARP in most experiments.The exception is cropland landscape where TsHARP's performance is slightly better because the LST spatial pattern correlates well with NDVI.Accordingly, the combined method can only achieve similar accuracy with TsHARP in cropland landscape.In summary, the proposed combined method performs more robustly than TsHARP across various landscapes.
The proposed error estimation method was derived using some approximations.For example, the small error for a model at coarse resolution might correspond to a large error at fine-resolution.Moreover, the error estimation of TPS was acquired with empirical manner.However, the improvement of the combined method could prove the reasonability of the proposed error estimation indirectly.More advanced statistical model is encouraged to be developed for estimating the sharpening error of TsHARP and TPS interpolation in the future, by which the thermal sharpening result could be further refined.Other than TPS, there are many spatial interpolation methods.Bilinear and cubic convolution interpolations are commonly used for resampling remote sensing data.Although the smoothness of the resampled images by these two methods is slightly weaker than that by TPS, they are also acceptable in most applications.Kriging is another popular interpolation technique that uses the variogram function.Although variogram analysis is helpful to understand the spatial dependence [32], the operation of Kriging interpolation is relatively complex because human intervention is required for the training of variogram model.It is expected that the sharpening accuracy could be further improved if new spatial interpolation technique specifically designed for thermal sharpening is developed.

Conclusions
In this study, a combined method of TsHARP and TPS (thin plate spline) interpolation was proposed for improving the accuracy and robustness of TsHARP for thermal sharpening.A set of sharpening experiments confirm the effectiveness of the proposed combined method based on error estimation.Without further parameter setting and assumptions, the proposed combined method is expected to be an effective thermal sharpening technique in the applications.Moreover, this study shows a potential of developing theoretical framework for combing different thermal sharpening methods.If the spatially explicit error of different methods can be reasonably estimated, the combined output is expected to be a better sharpening result.

Figure 1 .
Figure 1.Flowchart of the proposed combined method of TsHARP and TPS.

Figure 3
Figure3compares the true 90-m resolution LST image (Figure3a) with the sharpened images by the combined method, TsHARP, and TPS (Figure3b-d).TsHARP reconstructed rich spatial details of the LST image (Figure3c), while TPS produced a much smoother spatial pattern (Figure3d).In general, the result of TsHARP was more similar to the true LST image than that of TPS.However,

Figure 4 .
Figure 4. Relationship between Residual of NDVI-LST regression and sharpening RMSE by the proposed method and TsHARP.
. The simulation sharpening experiments were conducted from different initial resolutions to target resolutions.For ASTER data, the resolution combinations (initial resolution to target resolution) included 1440 m to 90 m, 1440 m to 180 m, 1440 m to 360 m, 720 m to 90 m, 720 m to 180 m, and 360 m to 90 m.For Landsat ETM+ data, the resolution combinations included 960 m to 60 m, 960 m to 120 m, 960 m to 240 m, 480 m to 60 m, 480 m to 120 m, and 240 m to 90 m.

Figure 5 .
Figure 5. Images and locations of the study areas.(a) Grassland in Inner Mongolia; (b) Cropland in Haihe river basin; (c) Rural area in Aichi prefecture; (d) Urban area in Yokohama city.

Figure 6 .
Figure 6.Thermal sharpening experiments (360-m to 90-m for ASTER and 240-m to 60-m for ETM+) in (a) grassland in Inner Mongolia; (b) cropland in Haihe river basin; (c) rural area of Aichi prefecture; and (d) urban area in Yokohama city.

Figure 7 .Figure 8 .
Figure 7. RMSEs of the sharpening experiments of proposed method and TsHARP for different landscapes and resolution-combinations.(a) Grassland in Inner Mongolia; (b) Cropland in Haihe river basin; (c) Rural area in Aichi prefecture; (d) Urban area in Yokohama city.

Figure 9 .
Figure 9. Schematic of triangular in feature space of NDVI and LST.
model variance and error variance, the error variance of sharpened LST in coarse pixel (x i , y i ) can be calculated as Equation (9):

Table 1 .
Information of the satellite data for experiments.