An Efficient Approach for Pixel Decomposition to Increase the Spatial Resolution of Land Surface Temperature Images from MODIS Thermal Infrared Band Data

Land surface temperature (LST) images retrieved from the thermal infrared (TIR) band data of Moderate Resolution Imaging Spectroradiometer (MODIS) have much lower spatial resolution than the MODIS visible and near-infrared (VNIR) band data. The coarse pixel scale of MODIS LST images (1000 m under nadir) have limited their capability in applying to many studies required high spatial resolution in comparison of the MODIS VNIR band data with pixel scale of 250–500 m. In this paper we intend to develop an efficient approach for pixel decomposition to increase the spatial resolution of MODIS LST image using the VNIR band data as assistance. The unique feature of this approach is to maintain the thermal radiance of parent pixels in the MODIS LST image unchanged after they are decomposed into the sub-pixels in the resulted image. There are two important steps in the decomposition: initial temperature estimation and final temperature determination. Therefore the approach can be termed double-step pixel decomposition (DSPD). Both steps involve a series of procedures to achieve the final result of decomposed LST image, including classification of the surface patterns, establishment of LST change with normalized difference of vegetation index (NDVI) and building index (NDBI), reversion of LST into thermal radiance through Planck equation, and computation of weights for the sub-pixels of the resulted image. Since the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) with much higher spatial resolution than MODIS data was on-board the same platform (Terra) as MODIS for Earth observation, an experiment had been done in the study to validate the accuracy and efficiency of our approach for pixel decomposition. The ASTER LST image was used as the reference to compare with the decomposed LST image. The result showed that the spatial distribution of the decomposed LST image was very similar to that of the ASTER LST image with a root mean square error (RMSE) of 2.7 K for entire image. Comparison with the evaluation DisTrad (E-DisTrad) and re-sampling methods for pixel decomposition also indicate that our DSPD has the lowest RMSE in all cases, including urban region, water bodies, and natural terrain. The obvious increase in spatial resolution remarkably uplifts the capability of the coarse MODIS LST images in highlighting the details of LST variation. Therefore it can be concluded that, in spite of complicated procedures, the proposed DSPD approach provides an alternative to improve the spatial resolution of MODIS LST image hence expand its applicability to the real world.


Introduction
MODIS thermal infrared (TIR) band data are mainly used to retrieve land surface temperature with such techniques as split window algorithms for the study of land surface energy budgets [1,2], water resource management [3,4], agricultural drought [5,6], and environmental biogeochemistry processes [7]. However, spatial resolution of the data is 1000 m under nadir, which is very low in comparison with that of visible and near-infrared band data (for example, 250 m for bands 1 and 2). The retrieved LST images from MODIS TIR data are also with much low spatial resolution (1000 m under nadir), which has limited their applications in many studies requiring high spatial resolution to identify detailed variation of thermal heat flux over the region under study [8]. Therefore, it is very necessary to increase the spatial resolution of MODIS LST images in order to meet the needs of many studies [9,10] requiring pixel scale to identify the details of LST variation under complicated environments [11,12].
To increase the spatial resolution of an image actually means to decompose its pixels into smaller ones. A number of studies have been done for visible and infrared band data [13][14][15]. Data fusion has been a well-known technique to integrate multi-sensors or multi-sources image data with different spatial resolutions (hence pixel scales) into an image with better pixel scale [16,17]. Gao et al., proposed the spatial and temporal adaptive reflectance fusion model (STARFM) for fusing Landsat and MODIS surface reflectance data to produce a synthetic "daily" surface reflectance product at the ETM + spatial resolution [18]. Hilker et al., improved the model through adding the day of disturbed information, extracted from the time-series MODIS images, to reduce the possible errors in the resulting fused image [19].
Several attempts have also been made to increase the spatial and temporal resolutions of thermal infrared imagery or LST data [20][21][22]. Guo and Moore developed a pixel block intensity modulation (PBIM) method to improve 120 m spatial resolution of Landsat TM thermal band 6 image into 30 m through integrate topographic details in reflective spectral bands for each thermal pixel block. The method improves the thermal variation caused by topography to 30 m resolution while maintaining the average thermal digital number (DN) unchanged [22]. Liu and Pu developed two methods to downscale the coarse resolution TIR radiance for sub-pixel temperature retrieval [23]. By considering the relationship of LST or emissivity between NOAA-AVHRR and Landsat TM data, Stathopoulou and Cartalis improved the PBIM method by defining a scaling factor to downscale AVHRR LST [24]. Nichol developed an emissivity modulation with a simplified assumption that thermal emissivity was uniform within the low-resolution pixels to decompose the images with coarse pixel scale into fine one using the emissivity estimated from the auxiliary data with the same pixel scale as the decomposed one [25]. Recently attempts have been made to use the spectral unmixing model and artificial neural network for disaggregation of the TIR or LST data [20,21].
Due to very high temporal resolution, MODIS data have been widely used for various studies. The low spatial resolution of MODIS thermal band data have been a major obstacle for application of the data to many studies such as urban heat island monitoring and agro-drought monitoring that required high spatial resolution to highlight details of spatial variation in the region under study. Several efforts have been made to down-scale the coarse pixel of MODIS thermal infrared band data into better spatial resolution [11]. The down-scaling was generally done through the well-known LST-NDVI relationship. Kustas et al., developed an approach called DisTrad method to establish a simple linear regression equation between LST and NDVI to decompose MODIS LST image into a better spatial resolution [26]. Essa et al., improved the evaluation DisTrad (E_DisTrad) method for urban areas [27].
NDVI is not only one of the factors affecting variation of LST in the real world [12]. Other relevant indices have also been used for decomposition of MODIS thermal band data [27,28]. An approach using fractional vegetation cover instead of NDVI was developed in Agam et al., for thermal band data sharpening [29]. Stathopoulou and Cartalis used the intensity of surface urban heat island to downscale the AVHRR LST images into the same spatial resolution as Landsat TM 6 band data [24]. Essa et al., compared the applicability of 15 different indices for pixel decomposition of MODIS LST images and found that the correlation between LST and the impervious percentage was the best in urban areas [30].
Zakšek and Oštir decomposed the pixels of MODIS LST images through principal components and regression equation within a moving widow over the assistant image with high resolution [12]. Jing and Cheng used a non-linearly transformation to produce the maximal correlation between MODIS TIR bands and multiple reflective bands for pixel decomposition of MODIS LST images [28]. These attempts of pixel decomposition usually are able to generate a result that is acceptable in image interpretation of thermal infrared data. However, they are not able to maintain the thermal infrared radiance of the pixels unchanged after decomposition.
Objective of this paper is to develop an efficient approach for pixel decomposition of MODIS LST image to increase its spatial resolution while the thermal radiance of sub-pixels in the resulted decomposed LST image remains as the same of the parent pixel from which the sub-pixels are generated. How to maintain the radiance of LST image pixels unchangeable while decomposing them properly is the core breakthrough of this study. Since two steps are involved in our proposed approach for pixel decomposition of MODIS LST image: initial temperature estimation for the sub-pixels in the resulted LST image and final temperature determination through weighting approach to maintain the thermal radiance unchanged after decomposition, we term it the double-step pixel decomposition (DSPD). After describing the theoretical basis and the image processing procedures of the DSPD approach, we designed an experiment to validate it through comparison with evaluation DisTrad (E-DisTrad) method developed by Essa et al. [27], which is very close to our DSPD method in the first step of pixel decomposition. Therefore, the paper is organized as a methodology section presenting the details of the DSPD approach and a result section to validate the approach as well as a conclusion to summarize the key points of the study.

Background
To increase the spatial resolution of a MODIS LST image actually means to decompose its pixels into smaller ones with relative high spectral multispectral image as assistance. Since ASTER was on board the same Terra platform as MODIS for Earth observation, the spatial resolution of MODIS LST can be decomposed from 1000 m into 250 m with its VNIR data or 90 m with ASTER visible/infrared (VNIR) data. The LST observed by ASTER can be the true value to validate the MODIS LST decomposed result, so the LST product and multi-spectral data of MODIS and ASTER are chosen as our experiment data.

Dataset
For the application site of Washington (Figure 1), the MODIS LST product (MOD11A1), MODIS emissivity product (MOD11A1, MOD11B1), ASTER LST product (AST_08), ASTER emissivity product (AST_05) and ASTER multi-spectral data (AST_L1B) are collected. A pair of MODIS and ASTER images were obtained for this experiment. The images were acquired at 16:03 Greenwich time (12:03 US east time) on 24 August 2003. The MODIS image covers the eastern region of USA (Figure 1a), while the ASTER image only occupies a small part of the MODIS image due to its very high spatial resolution. Figure 1b shows the ASTER image. After geometric correction, the pixel scale of the MODIS image is 1000 m while the scale of ASTER image is 90 m.
In order to validate the approach, a small typical subset of image covering the same region was made from the MODIS and ASTER images. Figure 1c shows the subset from MODIS image with 50 × 50 pixels and Figure 1d is the subset from ASTER image with 550 × 550 pixels. The subset covers parts of Washington DC and Maryland region, with Chesapeake Bay in its east. Three surface patterns can be easily identified in the subsets: urban region, water surface and natural terrain.

Pre-Processing of the Dataset
Since the spatial resolution and data format are different between MODIS LST product and VNIR data, the dataset need to do some pre-processing before they can be used. The coordinate reference system of MODIS product need to be converted to the UTM/WGS84 same as the ASTER data. Then an image-to image georeferencing is performed by collecting many dispersed ground control points (GCPs) throughout each image and applying 1st-order polynomial transformation to match the ASTER data with MODIS using the nearest neighbor resampling method. Then the co-registered images are overlaid over one another in order to examine potential pixel misalignement caused by different spatial resolution of MODIS and ASTER. Indeed, some misalignment was visually detected, which is further corrected by the upper-left corner of the ASTER image [24].

Theoretical Principle of DSPD
The TIR band data of remote sensing system such as MODIS are usually with relative much lower spatial resolution than the reflective ones (usually in VNIR spectral region) on the same system due to the fact that TIR data were obtained through detecting the ground-emitted thermal radiance which is much weaker than the ground-reflected solar irradiance which becomes the energy for optical imaging system according to spectral theorem described by Planck equation. As to MODIS, the LST product retrieved from MODIS TIR data (bands 29-36) are with a spatial resolution of 1000 m under nadir viewing while VNIR data (bands 1 and 7) are with the resolution of 250 m for bands 1 and 2 or 500 m for bands 3-7. This coarse pixel scale of the MODIS LST images has limited their application to environmental issues requiring high spatial resolution to detect the exiguous variation of thermal properties.
One way to solve the problem of low spatial resolution of MODIS LST images is to increase their spatial resolution through so-called pixel decomposition techniques. In this paper we intend to develop a new approach for the decomposition of MODIS LST images. Since the existing methods for decomposition of thermal infrared data and its retrieved LST images does not keep the thermal radiance of pixels unchanged after they are decomposed, our goal in the study is to break through this limit to maintain the thermal radiance in the decomposed image equal to their parent pixel from which they are generated.
Decomposition of MODIS LST images usually requires assistance of auxiliary data. In our case, the available auxiliary data covering the same region as the LST image are the MODIS VNIR band data such as bands 1 and 2 with pixel scale of 250 m or bands 3-7 in 500 m. Therefore, our specific goal is to develop an approach for pixel decomposition to increase the 1000 m spatial resolution of MODIS LST images into the same scale of 250 m using the available MODIS VNIR band data (e.g., bands 1 and 2) as auxiliary data. Figure 2 illustrates the main theoretical principle of our approach for pixel decomposition. In order to divide the 1000 m MODIS LST image pixel into the decomposed sub-pixels of 250 m scale and to maintain the thermal radiance of the parent pixel unchanged, we have to use the auxiliary data to compute a weight for each of the decomposed sub-pixels so that the total thermal radiance of the parent pixel can be distributed among the sub-pixels. The final result of decomposed LST image can be computed using the relationship between thermal radiance and surface temperature for each sub-pixel. Therefore, for a parent pixel i in the MODIS LST image (Figure 2), we can use the known land surface temperature (Ts) to compute its thermal radiance R(Ts) as follows using the Planck equation formula about the radiance at a given wavelength range [31][32][33]: where R(Ts) is the thermal radiance with land surface temperature Ts at given wavelength range (W· m −2 ) of the parent pixel i in MODIS LST image; εi is the emissivity of the pixel at given wavelength range λ1-λ2 (μm); Rb(Ts) (Unit, W· m −2 ) is the blackbody thermal radiance at a given wavelength range λ1-λ2 (μm). Ts is the land surface temperature (K); C1 and C2 are the radiance constants with C1 = 1.191 × 10 8 W· μm 4 · sr −1 · m −2 and C2 = 1.439 × 10 4 μm K respectively.
For convenient calculation and model, the relationship between thermal radiance of blackbody (Rb) and temperature (Ts) in specific wavelength (e.g., 8-13.5 μm, 10.78-11.28 μm) range will be found: where K1 and K2 are the coefficient of thermal radiance and temperature of the blackbody at different wavelength range given in Table 1. If we are able to find a radiance weight for each sub-pixel in the decomposed LST image, the thermal radiance of the sub-pixels can be easily computed as follows: where Rd_k is the thermal radiance of sub-pixel k, and Wk is the weight of the sub-pixel k, R(Ts) is the thermal radiance of the parent pixel. Since the pixel scale of auxiliary data (e.g., MODIS NVIR bands 1 and 2) is 250 m, the number of the decomposed sub-pixels for the parent pixel should be 16. In order to maintain the thermal radiance of the parent pixel i unchanged, we have: Then the LST of each sub-pixel in the decomposed LST image can then be computed with Equations (1) and (2) as follows: where Td_k is the decomposed LST for the sub-pixel k in the decomposed LST image.
In order to use Equation (5) to compute the land surface temperature for the decomposed pixels, one has to determine the weight and the emissivity for each sub-pixel. This can be done through the auxiliary VNIR band data.

Determination of Weight for the Pixel Decomposition
The radiance weight is needed to the computation of thermal radiance for each sub-pixel in the decomposed LST image. Then the LST of each sub-pixel could be calculated by its decomposed thermal radiance. Since we are not able to directly obtain the true decomposed temperature for determination of the weight to decompose the parent pixel, we use other estimated temperature for computation of weight for the sub-pixels and we term the estimated temperature at this stage the initial temperature for the sub-pixels.
If we can estimate the initial decomposed temperature (Tk) and emissivity (εk) for each sub-pixel of the decomposed LST image as shown in Figure 2, we are able to compute the thermal radiance at given wavelength of each sub-pixel as follows using Planck equation formula about the radiance as follows: where Rk is the estimated thermal radiance for sub-pixel k, Tk and εk are the estimated initial temperature and emissivity of the sub-pixel. Therefore the total thermal radiance of the sub-pixels can be estimated as follows: where Rtot is the total thermal radiance of the decomposed sub-pixels from the parent pixel. In order to meet the Equation (4) to maintain the thermal radiance of the parent pixel unchanged after decomposed, the weight for each sub-pixel can be determined as follows: Therefore the importance for weight determination is to properly estimate the emissivity and the initial temperature for each sub-pixel k.

Estimation of Emissivity for the Sub-Pixels
It is well known that emissivity of an object is mainly determined by its thermo-physical characteristics of the object. For the ground surface, the components composing of the surface are the main factors determining the ground emissivity [32,34,35]. Many effective methods have been approved to obtain the emissivity, e.g., the Day/Night method [36], grey body emissivity method [37], et al., Since the emissivity at different spatial resolution is needed in the decomposition process, we should find a method to estimate the emissivity of MODIS at different resolution. So the NDVI thresholds method (NTM) [38] is chosen to estimate the emissivity respectively at resolution of 250 m and 1000 m.
While the emissivity is variable with the wavelength, the NTM can estimate the emissivity of different land surface accurately at 10-12 μm range, so we choose the wavelength of band 31 (10.78-11.28 μm) for emissivity estimation and thermal radiance calculate based on the MODIS band wavelength design. At this wavelength range, the emissivity can be modeled as follows [ Subject to: where ελ is the band emissivity, εvλ and εsλ are respectively the vegetation and soil emissivity, Pv is the proportion of vegetation, C is a term due to surface roughness (C = 0 for flat surface), NDVIv and NDVIs are the NDVI for a fully vegetated pixel and soil one respectively, F' is a geometrical factor ranging between zero and one. Usually vegetation cover fraction at pixel scale can be computed from its NDVI as follows [39,40]: Over particular areas, NDVIv and NDVIs values can be extracted from the NDVI histogram. Values of NDVIv = 0.5 and NDVIs = 0.2 were proposed by in to apply the method in global conditions [38]. While the value for vegetated surfaces (NDVIv = 0.5) may be too low in some cases, for higher resolution data over agricultural sites, the NDVIv can reach 0.8 or 0.9 [41].

Estimation of the Initial Temperature for the Sub-Pixel
Since the core of our approach for pixel decomposition is to divide the LST of the parent pixel into the decomposed sub-pixels, the temperature of the sub-pixel is the final result of the decomposition that we intend to obtain from the decomposition. Thus, we are not able to use the true decomposed temperature for determination of the weight to decompose the parent pixel. We have to use other approaches to estimate the temperature for computation of weight for the sub-pixels and we term the estimated temperature at this stage the initial temperature for the sub-pixels.
It has been well known that land surface temperature was determined by the surface structure shaping the thermal performance of the ground surface [26,42,43]. Vegetation has been recognized as the most important factor governing the performance of LST variation at regional scale [29]. Thus the relationship between vegetation index and LST has been extensively examined for various applications to such issues as drought and evaporation [4,5], urban heat island [20] and spatial sharpening of thermal imagery [29].
In the study we also use the relationship between NDVI and LST to estimate the initial temperature of the sub-pixels for weight determination. Among the three patterns of ground surface, natural terrain has the most obvious feature of vegetation dynamics. Thus the relationship between NDVI and LST is mainly used to estimate the initial temperature for the sub-pixels categorized as the pattern of natural terrain. The establishment is done over the MODIS LST image and the re-sampled NDVI image retrieved from the auxiliary VNIR band data to match the pixel scale of LST image at 1000 m. Therefore we have the relationship for the initial temperature estimation as follows: Tns = an + bn NDVIr (12) where Tns is the LST for the pixels of natural terrain in the LST image, NDVInr is the vegetation index in the resampled NDVI image of the pixels, an and bn are the regression coefficients between LST and the NDVI. This regression is usually with a standard error of estimation (SEE) due to the fact that the relationship is not actually with a rigorous linearity but with a vibration on both sides. Therefore we improved the relationship by consideration of this estimate error and then applied to the retrieved NDVI image for the estimation of initial temperature of the sub-pixels as follows: where Tnk is the estimated initial temperature of the sub-pixels classified as natural terrain pattern; NDVIns is the NDVI of the sub-pixels computed from the auxiliary VNIR band data; an and bn are regression coefficients obtained from Equation (18), Rn(T) is a function to generate random number using computer clock as the initial for randomization. Since regression of the relationship between LST and NDVI is generally with a SEE, the inclusion of the random number by the function is conFigured to be as follows: where ASEEn is the adjusted SEE, which should be as a function of NDVI with a maximum of ±3 K according to our experiences. This is because the vibration of LST for the same NDVI level is with a trend of increasing for a decreasing of NDVI in magnitude. Therefore, for sub-pixels with high value of NDVI, the LST is generally low and with a vibration to be also small. As a contrast, the LST is usually not only very high but also with high vibration for the pixels with a low value of NDVI ( Figure 3). Accordingly we have the function of ASEEn to NDVI for the natural terrain pattern as follows: Though natural terrain occupies most of the ground in pixel scale, urban region and water are also the common surface patterns that we encounter in remote sensing images. This is especially true when high spatial resolution images are under studies. For the VNIR band data with pixel scale of 250 m, the two surface patterns also frequently appear. To estimate the initial temperature for the two patterns, we follow the same methodology as done for natural terrain. Instead of NDVI, the normalized difference of building index (NDBI) and water color index (WCI) are used for the estimation of initial temperature, respectively as follows: where Tmk and Twk are the estimated initial temperature for the sub-pixels of building surface and water bodies, NDBIs and WCIs are the normalized difference of building index and water color index for the two surface patterns respectively, am and bm are the regression coefficients between LST and NDBI, aw and bw are the regression coefficients between LST and WCI, Rm (T) and Rw (T) are the random function for the two surface patterns, which can be given according to the magnitude of their surface temperature vibration as follows: Thus we have the initial temperature determination as follows:  (20) where Tk is the estimated initial temperature for weight determination in our approach.

Estimation of the Important Parameters for the Approach
Three important parameters are required to determine for the approach: NDVI, NDBI and WCI. Since we use MODIS bands 1 and 2 as our auxiliary data for the decomposition of LST image, the parameter NDVI for the auxiliary data can be computed as follows: where B2 and B1 are the pixel value of MODIS bands 2 and 1 respectively. Atmospheric correction should be done to the two band data before they are used to compute the NDVI. The normalized difference of building index (NDBI) and water color index (WCI) can be computed as follows: 2 (23) where B6 is the pixel value of MODIS band 6, and ND2 is the DN value of water pixels in MODIS band 2. Therefore, the value of WCI ranges from 0-1.0. The color of water tends to be light as WCI increases.

Procedures of the Approach for Pixel Decomposition
The procedures to conduct pixel decomposition with the approach can be summarized as follows: (1) Preparing the required data. This includes the MODIS LST and emissivity product with pixel scale of 1000 m for the pixel decomposition and the required auxiliary data which is the MODIS VNIR band data (e.g., bands 1 and 2) with pixel scale of 250 m, and band 6 with pixel scale of 500 m. Resampling MODIS band 6 to pixel scale of 250 m matching that of bands 1 and 2. Geometric correction is required to be done for the LST image and the VNIR band data so that they are coordinately matched with each other to cover the same geographical region. (2) Classifying the auxiliary data. A classified image with three surface patterns for the pixels is generated: natural terrain, urban region, and water body. (3) Estimating the essential parameters. This is to compute the four essential parameters (e.g., NDVI, NDBI, WCI, and Pv) from the auxiliary data, resulting in four corresponding parameter images with pixel scale of 250 m. The parameter images are then re-sampled to the pixel scale of 1000 m. (4) Estimating emissivity. This is done to the auxiliary NDVI data (250 m) with the NTM, resulting in an emissivity image at the wavelength of 10.78-11.28 μm required to compute the weight. (5) Establishing regressions equations for the 3 surface patterns. This includes to take a samples of pixels from the MODIS LST image and the re-sampled image of the corresponding parameter at pixel scale of 1000 m and then to carry on regression analysis between LST and the corresponding parameters to determine the coefficients of the regression equations. (6) Estimating the initial temperature. This is to apply the regression equations obtained in step 5 to the auxiliary data with pixel scale of 250 m, using the classified image and the parameter images as assistance. (7) Determining the weight. This includes to compute the thermal radiance from the initial temperature image and the emissivity image with pixel scale of 250 m and to summarize thermal radiance for each block with 4 × 4 pixels in order to match the pixel scale of the MODIS LST image. (8) Computing thermal radiance of the sub-pixels with scale of 250 m. This includes to compute the thermal radiance of the LST image for each parent pixel and to compute the thermal radiance for each of the sub-pixels in the block. (9) Reversing temperature from the thermal radiance of the sub-pixels to generate the final result of decomposing the LST image.
The above procedures are illustrated in Figure 4 showing the technical process of the approach.

Experiments for Validation of the Approach
The best way to validate the applicability of the approach is to compare the decomposed LST image with the simultaneous measurement of LST in the same geographical region. Since ASTER was on board the same Terra platform as MODIS for Earth observation, we can use ASTER LST product to validate the decomposed LST image if we can find a pair of ASTER and MODIS images matching precisely the acquisition time and place. We can use the ASTER LST product to validate the decomposed LST image if we can find a pair of ASTER and MODIS images matching precisely the acquisition time and place. Spatial resolution of ASTER LST data is with pixel scale of 90 m. Therefore we can decompose the MODIS LST image into the same pixel scale as ASTER LST image to check the accuracy of the approach, which can be assessed through the root mean error, standard deviation and mean square error between the ASTER LST and the decomposed LST image: where ME, STD and RMSE are respectively the mean error, standard deviation and root mean square error of the decomposed LST image from the MODIS LST image in comparison with the ASTER LST image, LSTm and LSTa are the decomposed LST and the ASTER LST respectively for the corresponding pixels in scale of 90 m. N is the pixel number of the ASTER LST image.
In order to conduct the validation, a series of procedures have to be conducted for computation of LST error statistics indices in Equations (24)- (26). As shown in the Figure 5, the decomposition process has some difference with the MODIS decomposition with its VNIR. The difference is mainly focus on the thermal radiance calculation with ASTER and MODIS. Because the wavelength range of the MODIS and ASTER thermal bands are different, and the effective emissivity varies with different wavelength range. The radiance cannot be compared directly between MODIS and ASTER. So the broadband emissivity (BBE) is used to calculate the thermal radiance of MODIS and ASTER in wavelength range 8-13.5 μm. Ogwa et al., showed that surface broadband emissivity could be estimated as a linear combination of the narrowband emissivity estimates in the range of 8-13.5 μm [33,44] where εbb_ast and εbb_mod are respectively the BBE of MODIS and ASTER, ε10 to ε14 are the ASTER thermal narrowband emissivity from band 10 to 14. ε29 and ε31 are the MODIS thermal narrowband emissivity for bands 29 and 31. Similarly the thermal radiance of ASTER and MODIS pixel at 8-13.5 μm could be calculated as follow: Then procedures to conduct MODIS LST pixel (1000 m) decomposition into ASTER LST (90 m) with the approach can be summarized as follows: (1) MODIS thermal radiance calculation: we will calculate the broadband emissivity with the MODIS emissivity of band 29 and 31. Then the MODIS thermal radiance at the wavelength range 8-13.5 μm will be calculate by the broadband emissivity and MODIS LST. Re-sampling the MODIS thermal radiance to 990 m with the nearest neighbor method.

Decomposition of MODIS LST with Its VNIR Data
We applied the procedures of our DSDP approach to MODIS LST images for pixel decomposition with its VNIR data. Initial temperature for each sub-pixel in the decomposed image needs to be firstly estimated so that the weight can be computed for the decomposition. This is done on the basis of the relationships respectively between LST and NDVI for natural terrain, LST and NDBI for urban surface, LST and WCI for water surface. The relationship between LST and remote sensing indices could be obtained from the MODIS LST product and MODIS VNIR auxiliary data as shown in Table 2. Using the relationships established in Table 2, we are able to estimate the initial temperature for the decomposed image and to compute the weight for the decomposition. Then we can decompose the experiment MODIS LST image into the pixel scale of 250 m with the procedure in Section 3.6. Figure 6a shows the MODIS LST image from the MOD11A1 product with the pixel scale of 1,000 m and Figure 6b shows the decomposed result of the MODIS LST image into the pixel scale of 250 m by DSPD approach in the Washington DC and Maryland region. Very similar distribution of the LST can be seen in the two images. Since Figure 6a has a pixel scale of 1000 m while the scale in Figure 6b is 250 m, many more details of LST variation in the Washington DC and Maryland region can be highlighted in Figure 6b than in Figure 6a. The low LST along the Potomac River flowing through Washington DC cannot be clearly identified in the MODIS LST image (Figure 6a), while it is clearly seen in the decomposed LST image (Figure 6b) Many more hot points in Washington DC are shown in the urban area through the decomposed LST image (Figure 6b).

Comparison with the ASTER LST for Validation
Using the procedure shown in Figure 5, we are able to decompose the entire MODIS LST image into the pixel scale of 90 m. Figure 7a shows the decomposed result of the MODIS LST image. The accuracy of the decomposed LST (Figure 7a) can be assessed through comparison with the simultaneous measurement of LST (Figure 7b) obtain from the ASTER LST product. Very similar distribution of the LST can be seen in the two images (Figure 7a,b), demonstrating that the decomposed LST from the coarse spatial resolution of MODIS LST image is applicable.
In order to demonstrate the applicability of our approach, we also conducted the pixel decomposition of MODIS LST image using the E-DisTrad method and the re-sampled approach used in general image processing. Both E-DisTrad and re-sampled methods are unable to maintain the rule of thermal radiance unchanged after decomposition. Figure 7c,d show the results of these two methods. Obvious difference can be seen between the decomposed LST image by the E-DisTrad method and the re-sampled approach.
Much details of LST change can be seen in the result of E-DisTrad method, while the LST distribution tends to be even in the result of re-sampled approach. On the other hand, similar LST distribution can be seen in the results from our approach and the E-DisTrad method because both are based on the relation between LST and remote sensing indices. However, detailed comparison reveals that our decomposed LST image is closer than the decomposed LST image of E-DisTrad method to the ASTER LST image. Table 3 compares the decomposition accuracy of our approach with the E-DisTad and re-sampled methods. As indicated in Table 3, ME and RMSE of our DSPD method is the smallest among the three methods for comparison, which demonstrates that our approach is the best among the three. For the entire image, the ME of our DSPD is −1.29 K, which mean the decomposition result is totally 1.29 K lower than its true result, and the RMSE is 2.7 K, which means that the decomposition accuracy is with an average error of ~2.7 K in spite of its general applicability. This is mainly attributed to the uncertainty of LST change with NDVI and NDBI used to estimate the initial temperature for the sub-pixels. Comparison has been done to each of the three typical surface patterns. As seen in Table 3, our DSPD method has the lowest RMSE in all three surface patterns. For natural terrain, the RMSE is 2.1 K for our DSPD, 2.5 K for E-DisTrad and 3.1 K for resampling. For urban surface, the RMSE is 3.6 K for our DSPD, 4.1 K for E-DisTrad and 4.3 K for resampling (Table 3).  Figure 1d.
The re-sampling method has the highest RMSE for natural terrain and urban surface and the highest RMSE of E-DisTrad method is in water body. Figure 7c indicates that the E-DisTrad method estimates LST to be around 304-306 K for the water body while the ASTER LST is generally <300 K (Figure 7b). Thus it has an overestimate of ~4 K. Precise comparison has been also conducted to typical areas in the image. Four typical areas were choose for the comparison (Figure 1d). Area A is mainly urban surface while area B is natural terrain. Area C is mainly water body and area D is a mixture of urban surface and natural terrain. Table 3 shows that our DSPD approach has the lowest RMSE among the three methods in these typical areas in comparison of the decomposed LST with the ASTER LST. Our DSPD approach has a RMSE of 2.6 K in area A and 1.8 K in area B, while the RMSE of E-DisTad method is 3.3 K and 2.4 K respectively for areas A and B. The RMSE of re-sampled method is the highest, implying that the method is the worst in decomposing the pixels of coarse LST images into the fine images with high spatial resolution.

Discussion with the Influence of Initial Temperature on DSPD Results
Several studies such as Agam et al. [29] and Essa et al. [27] used the initial temperature as the final decomposed results because they only based on the relationship between LST and remote sensing indices for the decomposition. However, this is not the best result, because it does not maintain the thermal characteristics of pixel scale to be unchangeable after the decomposition. And Zhu et al., found the relationships of LST and remote sensing indices are variable at different spatial resolution [11]. Meanwhile, the remote sensing indices value ranges are also changeable with spatial resolution. e.g., the maximum NDBI values of MODIS and ASTER are respectively 0.2 and 0.5. So when we used the ASTER NDBI and NDBI-LST relationship of MODIS to obtain an initial decomposed temperature, the initial decomposed temperature may higher than its true value. This higher estimation has been apparently reflected by the scatter plot (Figure 8a) and the difference statistic chart (Figure 9a) between initial decomposed temperature and ASTER LST. While this inaccuracy LST estimation has got some improvement after we used the DSPD method based on the initial decomposed temperature as shown in Figure 8b and Figure 9b.
We respectively made the scatter plot by the result of E_DisTrad and DSPD method with the ASTER LST respectively, the regression line of DSPD is much closer to the red 1:1 reference line with the slop of DSPD regression line (Figure 8b) is 0.65 better than that of E_DisTrad method 0.58 (Figure 8a). The difference statistic chart of the decomposed LST and ASTER LST (Figure 9b) also indicated a satisfactory estimation with our DSPD method than E_DisTrad method.  Since the E-DisTrad method may generate some error in the LST disaggregation process, we need to make some hypothesis to evaluate how much the error in the initial temperature estimated by E-DisTrad method would finally influence the DSPD result. For a MODIS LST pixel could decompose to 16 sub-pixels with its VNIR data, it is difficult to evaluate error of each sub-pixel could influence the final result. So we make an extreme hypothesis, the 16 sub-pixels composed by 15 vegetation sub-pixels (low LST) and 1 urban sub-pixel (high LST), and the initial temperature error of vegetation and urban sub-pixels is in the range of ±3 K. Under this hypothesis, the MODIS LST and the parameters of its sub-pixels were given in Table 4. The object of DSPD method is to maintain the thermal radiance to be unchangeable after the decomposition. If the 15 vegetation sub-pixels among total 16 sub-pixels have the same estimated initial temperature error, the thermal radiance conducted by this error will intentionally reflect on the other one urban sub-pixel. As shown in Figure 10, we give the DSPD LST error of urban sub-pixel and vegetation sub-pixels under different initial temperature estimation error. An interesting phenomena will be found, if the vegetation initial temperature error and the urban initial temperature error increase or decrease together, the DSPD LST result of urban and vegetation will both have little error with the green color, because the weight for the sub-pixels change little under this situation. Meanwhile, if the vegetation initial temperature error and the urban initial temperature error increase or decrease inversely, the DSPD LST result of urban and vegetation will have high error with the blue or red color. This means if the 15 vegetation sub-pixels initial temperature are higher estimated and urban sub-pixel initial temperature are lower estimated, the DSPD urban sub-pixel LST will be much lower than its true LST (the blue region of Figure 10a) while the DSPD vegetation sub-pixel LST will be little higher than its true LST (the red region of Figure 10b). Under the maximum situation, the vegetation sub-pixels initial temperature have 3 K higher estimation and the urban sub-pixel initial temperature has 3 K higher estimation, the DSPD urban sub-pixel LST will be 5.7 K lower than its true LST while the DSPD vegetation sub-pixels LST will only be 0.4 K higher than its true LST. And if the 15 vegetation sub-pixels initial temperature are lower estimated and urban sub-pixel initial temperature are higher estimated, the DSPD urban sub-pixel LST will be much higher than its true LST (the red region of Figure 10a) while the DSPD vegetation sub-pixel LST will be little lower than its true LST (the blue region of Figure 10b). Although the DSPD may obtain a large error under the maximum situation in the urban sub-pixel, the error RMSE of total 16 sub-pixels with DSPD is 0.8 K much less than the RMSE of initial temperature RMSE of 3 K (Figure 11), so the error RMSE of DSPD is totally better than the initial temperature.

Applying the Approach to Another Case
We applied the procedures of our DSDP approach to another site of Shanghai in east of China (Figure 12a). With rapid development of economy, urbanization is very obvious in the region, especially Shanghai City, the LST product with high spatial resolution is the essential data for urbanization study. As shown in Figure 12. The ASTER and MODIS LST in Shanghai area were acquired at 02:47:07 Greenwich time (10:47:7 local time) on 1 August 2000. Figure 12c shows the subset of the MODIS LST image and Figure 12d is the subset of the ASTER LST image covers Shanghai region, the Taicang city and Jiading district. The decomposed results of E_DisTrad ( Figure 12e) and our DSPD (Figure 12f) all could reflect the spatial distribution of high LST in the urban building area and low LST in the vegetation and water area. The decomposed LST result of E_DisTrad has an apparently high estimation compared with the ASTER LST data. The LST in vegetation area with about 306 K is highly estimated to 310 K, and the hot points in the urban area are much more. This high estimation result has been effectively decreased after our DSPD method. The decomposed LST result of DSPD in the vegetation area is lower than the E_DisTrad result and much similar with the observed ASTER LST. We would find the decomposed LST result of DSPD in the urban area of Taicang city and Jiading district is about 2 K lower than ASTER LST. This is mainly attributed to the discrepancy between the ASTER and MODIS LST product in this area [45]. Therefore our DSPD method provides an alternative approach to improve the spatial resolution of thermal remote sensing system such as MODIS with coarse pixel scale.

Conclusions
An efficient approach has been proposed in the study to increase the spatial resolution of MODIS LST image. The theoretical basis of the approach is that the change of LST is with accordance with characteristics of surface patterns. Therefore, using a scheme to classify the image pixels into three surface patterns and establish the relationship between LST and NDVI or NDBI, we are able to decompose the pixel scale of the MODIS LST image into the same pixel scale of MODIS VNIR bands which are used as auxiliary data for the decomposition. Since the pixel scale of MODIS VNIR bands is 250 m, we are able to increase the spatial resolution of the MODIS LST image into 3 times, i.e., to decompose the pixel scale of the MODIS LST image from its original 1000 m into 250 m in the resulted LST image. The unique feature of this pixel decomposition approach is to maintain the thermal radiance of parent pixels in the MODIS LST image unchanged after they are decomposed into the sub-pixels in the resulted image. There are two important steps in the decomposition: initial temperature estimation for the sub-pixels in the resulted LST image and final temperature determination through weighting approach to maintain the thermal radiance unchanged after decomposition. Therefore it is termed the double-step pixel decomposition (DSPD).
A series of procedures have been developed for the pixel decomposition approach. These include re-sampling the MODIS VNIR data, classification of the image pixels into three surface patterns, establishment of the relationship between LST and NDVI or NDBI or WCI for the three surface patterns, computation of weight for each sub-pixel, and determination of final temperature for the sub-pixels. An experiment was conducted in the study to validate the applicability of the approach. Because ASTER and MODIS are on board the same platform for Earth observation, the ASTER LST which has much higher spatial resolution than the MODIS LST image was used as the true LST measurement to compare with the decomposed LST from the MODIS LST image. We compared the DSPD approach with the E-DisTrad method and re-sampling method.
The comparison results indicate our DSPD approach is the best among the three. RMSE of our DSPD is 2.7 K for the entire image used for the experiment. Re-sampling method has the highest RMSE, implying that it is the worst for pixel decomposition. This is mainly because the method does not consider the effect of surface patterns on LST variation in its decomposition, while the DSPD and E-DisTrad methods are based on this effect for decomposition. The obvious difference of our DSPD from the E-DisTrad is that the thermal radiance remains unchanged after decomposition while E-DisTrad does not emphases this principle.
In spite of general applicability, the decomposition accuracy is not as high as expected. For our DSPD, its RMSE indicates that the decomposition is with an average error of ~2.7 K. This is mainly attributed to the uncertainty of LST change with NDVI and NDBI used to estimate the initial temperature for the sub-pixels. Decomposition accuracy is different in different surface patterns. The lowest RMSE can be seen in the water body, which is followed by natural terrain. The urban region has the highest RMSE for the decomposition. This is due to the fact that LST has the lowest fluctuation in water bodies and the greatest variation in urban region.
The DSPD approach has been applied to MODIS LST images in east USA and east China for pixel decomposition to increase their spatial resolution. Since the MODIS NVIR band data were used as assistance for the decomposition, the resulted LST images from the decomposition efficiently increase the spatial resolution in both cases. The pixel scale of MODIS LST images were 1000 m while the scale of the decomposed LST images is 250 m, implying that the spatial resolution of the LST image has increased by three times after decomposition. The obvious increase in spatial resolution remarkably uplifts the capability of the LST images in applying to the studies requiring high spatial resolution of LST distribution to highlight the tiny variation of thermal flux over ground surface. Much more details of LST variation can be clearly identified in the decomposed LST image than in the original MODIS LST. Therefore it can concluded that the DSPD approach provide an alternative opportunity to improve the spatial resolution of MODIS LST images hence expand the applicability of the images in the real world.