Quantification of the Scale Effect in Downscaling Remotely Sensed Land Surface Temperature

Most current statistical models for downscaling the remotely sensed land surface temperature (LST) are based on the assumption of the scale-invariant LST-descriptors relationship, which is being debated and requires an in-depth examination. Additionally, research on downscaling LST to high or very high resolutions (~10 m) is still rare. Here, a simple analytical model was developed to quantify the scale effect in downscaling the LST from a medium resolution (~100 m) to high resolutions. The model was verified in the Zhangye oasis and Beijing city. Examinations of the simulation datasets that were generated based on airborne and space station LSTs demonstrate that the developed model can predict the scale effect in LST downscaling; the scale effect exists in both of these two study areas. The model was further applied to 12 ASTER images in the Zhangye oasis during a complete crop growing season and one Landsat-8 TIRS image in Beijing city in the summer. The results demonstrate that the scale effect is intrinsically caused by the varying probability distribution of the LST and its descriptors at the native and target resolutions. The scale effect depends on the values of the descriptors, the phenology, and the ratio of the native resolution to the target resolution. Removing the scale effect would not necessarily improve the accuracy of the downscaled LST.


Introduction
Land surface temperature (LST) is one of the most important parameters that are related to the energy balance and water cycle on the earth's surface.It acts as a necessary input of models in various disciplines such as climatology, meteorology, hydrology, agriculture, environmental science, and ecology [1][2][3][4].Satellite thermal infrared (TIR) remote sensing provides a good data source for deriving LSTs.Some of the satellite LST products, such as the Moderate Resolution Imaging Spectroradiometer (MODIS) LST product [5] and the Spinning Enhanced Visible and Infrared Imager (SEVIRI) LST product [6], have been generated operationally and have greatly contributed to the modeling of land surface processes at various scales.
Current satellite remote sensing faces a crucial challenge: it is extremely difficult to acquire images that have both high temporal and spatial resolutions due to tradeoffs among these resolutions [7][8][9].Most satellite TIR sensors have been designed with medium (i.e., ~100 m) to low (i.e., hundreds of Remote Sens. 2016, 8, 975 2 of 24 meters to thousands of meters) spatial resolutions; thus, thermal mixture is inevitable in TIR images.However, LSTs that have high (i.e., tens of meters) or even very high (i.e., ~10 m) resolutions are desirable in many applications that are related to fragmental landscapes, such as surface evapotranspiration (ET) mapping in agricultural fields and heat island monitoring in urban areas [9][10][11][12].
In the most recent two decades, LST downscaling has attracted the interest of scientists, and relevant research has culminated in the past decade [13].Models for LST downscaling can be categorized as statistically linear or nonlinear, modulated, hybrid models [14] and those based on land surface model inversion [15,16].Due to their simplicity and robustness, the statistical models have become the most prevalent.Two classical statistical models are the DisTrad model (the disaggregation procedure for radiometric surface temperature) [8] and its descendant, the TsHARP model (the algorithm for sharpening thermal imagery) [17].Recently, many improvements on DisTrad and TsHARP have been developed by introducing more descriptors in addition to the normalized difference vegetation index (NDVI) and fractional vegetation cover (FVC), such as the indices of non-vegetated surfaces (soil, built-up area, urban, impervious surface, etc.), surface reflectance, and surface albedo [14,[18][19][20][21]. Successful applications of the statistical models have been reported from case studies in the agricultural area [22][23][24], urban area [7,18,20,25,26], and other areas, which also could have high heterogeneity [14,27,28].
Although the statistical models provide an effective way to improve the spatial resolution of satellite LSTs, they have the following problems.The first problem lies in the basic assumption of LST downscaling, which assumes the LST-descriptors relationship is scale-invariant [8,17].This assumption is, however, being debated [24,[29][30][31].Jeganathan et al. (2011) reported evident variations in both the slope and intercept in the statistically linear model at resolutions that are coarser than 90 m, and a resolution-adjusted model was developed to address the scale effect in downscaling LSTs over a mixed agricultural landscape [24].An error estimation method that employs the equivalent random sample size was developed by [31] to reflect the scale effect, and the modified TsHARP had a higher accuracy.Ghosh and Joshi (2014) also reported the existence of the scale effect in LST downscaling and found that the scale effect is a function of the surface characteristic and regression algorithm [30].However, these investigations on the scale effect have been hindered by insufficient remote sensing data; thereby, how the scale effect affects downscaling remains uncertain.The second problem exists in the native and target spatial resolutions for downscaling LSTs.Many researchers focus on downscaling LSTs from low resolutions to medium resolutions, but the current remotely sensed LSTs with high resolutions or even very high resolution are rare and unable to support applications such as the microscale mapping of surface ET and urban heat islands (UHI) [9,10].Although there are a few instances for downscaling satellite LST to high resolutions [22,30,[32][33][34], the scale effect problem has been neglected or examined only qualitatively due to the lack of sufficient remote sensing data.
In this context, this study aims to quantify the scale effect in LST downscaling.Note that the scale effect is defined as the error in the downscaled LST at the target resolution, whereby the error is induced by the LST-descriptors relationship at the native resolution.By selecting an agricultural oasis and a suburban area as the study areas, four statistical models derived from DisTrad are applied to the Terra Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) and Landsat-8 Thermal Infrared Sensor (TIRS) LSTs to understand the consequent scale effects.The target resolutions in downscaling are 15 m for ASTER and 30 m for TIRS, the same as the resolutions of the multispectral channels of ASTER and Landsat-8 Operational Land Imager (OLI).
This paper is organized as follows.Section 2 describes the study areas and datasets.Section 3 presents the methodology for quantifying the scale effect.The results are presented in Section 4. Uncertainties with this study and other issues are discussed in Section 5. Some conclusions are drawn in Section 6.

The Zhangye Oasis
The first study area is an agricultural oasis, the Zhangye oasis, which is located in the middle reaches of the Heihe River Basin (HRB), in northwest China (Figure 1).The Zhangye oasis has a semi-arid climate.Its terrain is flat, with an elevation of approximately 1480 m.It is surrounded by barren land, e.g., sandy desert, desert steppe, and the Gobi desert.Within the oasis, the land surface is mainly covered by maize, with a growing season that lasts from May to September.There are also bare soil, built-up surfaces, vegetable fields, orchards, and shelter forests.The landscape heterogeneity of this oasis therefore greatly decreases from May to July and August, but then, it increases slightly in September because of the crop growth.The agricultural fields in the Zhangye oasis are typically on the order of ~100 hectares or ~10 4 m 2 .Therefore, such a dynamical landscape requires ~10 m LSTs to map the surface ET at the sub-field scale; it also facilitates the evaluations of downscaling models under very different scenarios.Due to the semi-arid climate, the Zhangye oasis has an urgent need for monitoring and managing the water resources during the crop growing season [35,36].Although satellite LSTs such as MODIS (with a 1-km resolution) and ASTER (with a 90-m resolution) have been applied in estimating the surface ET in the Zhangye oasis, the users still require satellite LST with a much higher resolution to better manage the water resources in this area [36,37].
The ground measurements were provided by the first thematic experiment of the Heihe Watershed Allied Telemetry Experimental Research (HiWATER), namely, the MUlti-Scale Observation EXperiment on Evapotranspiration (HiWATER-MUSOEXE) [35,38,39].A flux observation matrix was constructed in the study area from May to September 2012.The longwave radiation and other meteorological variables were collected in the core experimental area (5.5 km × 5.5 km), within which a total of 17 automatic meteorological stations (AMSs) were distributed, including one in a vegetable field (EC01), one in a residential area (EC04), one in an orchard (EC17), and 14 in maize fields (Figure 1).EC16 was excluded because it provided only the surface net radiation.At the other 16 sites, the incoming and outgoing longwave radiations were recorded every 10 min by four-component radiometer sets, including seven CNR1 sets and seven CNR4 sets produced by Kipp and Zonen, one NR01 set produced by the Hukseflux, and one PSP and PIR set produced by the Eppley.Intercomparison of these radiometer sets demonstrated that they had good agreement [39]: the average errors of the measured incoming and outgoing longwave radiation were ~6 W/m 2 and ~3 W/m 2 , respectively, which induce an uncertainty of ~0.6 K in the LST.The longwave radiation measurements were then converted to LSTs.Details can be found in [40].The calculated LST will be used to validate the downscaled ASTER LST.
To better understand the scale effect in the LST downscaling in a complete crop growing season, 10 daytime ASTER images in 2012 were used in this study, including one image in May (30 May), two images in June (15 and 24 June), one image in July (10 July), four images in August (2, 11, 18 and 27 August), and two images in September (3 and 12 September).Two additional ASTER images were acquired on 19 and 28 September in the harvest period.Note that these two images were acquired after HiWATER-MUSOEXE, and thus, no in situ measured LSTs were available.All of the ASTER images were acquired under clear skies except for a few clouds that existed in the southern part of the oasis on 12 September.The overpass times for these ASTER images were 04:12 or 04:19 UTC.
ASTER products in levels 1B and 3A01 were purchased from the ASTER Ground Data System of the Earth Remote Sensing Data Analysis Center of Japan.The spatial resolutions of the visible and near infrared (VNIR), short-wave infrared, and TIR channels are 15, 30, and 90 m, respectively.Atmospheric correction was applied to the VNIR channels with MODTRAN5.2.2 code and in situ measured atmospheric profiles to obtain the surface reflectance [41].With the assistant of MODTRAN5.2.2, the LSTs were estimated through the radiative transfer equation with in situ atmospheric profiles acquired at the Zhangye National Climate Observatory (39 • 05 N and 100 • 17 E) [42][43][44].The land surface emissivity was determined with the NDVI threshold method based on the in situ emissivities of the vegetation, soil, and other typical surfaces [45].Thermal Airborne Spectrographic Imager (TASI) images of 17 stripes that covered the entire study area were collected at 02:45-07:10 UTC on 10 July 2012.The resolution of the TASI images was 3 m.The LSTs were estimated through a temperature-emissivity separation method with an accuracy of better than 1.5 K [46].The LSTs of different stripes were temporally normalized to the overpassing time of ASTER on July 10, with a diurnal temperature cycle model [47].Different stripes of the TASI images were mosaicked and then carefully co-registered with the ASTER images.

Beijing City
The second study area is a suburban area of Beijing city, which is located in Beijing municipality (39°25′N-41°03′N, 115°25′E-117°30′E) in north China (Figure 1).Beijing has a hot and humid summer and a cold and dry winter.The relative humidity is approximately 72%-74% in summer and 44%-47% in winter; 75% of the annual precipitation occurs in summer [12].As the capital of China, Beijing has a population of over 20 million and a built-up area of over 1200 km 2 .In recent years, Beijing has suffered from severe UHI effects due to rapid urbanization, physical conditions, and intense human activities.
Landsat-8 OLI and TIRS images that covered Beijing city at 02:55 UTC on 31 July 2013 were collected.The spatial resolutions are 30 m for the OLI multispectral images and 100 m for the TIRS images.In addition, one image acquired by the TIR channel of the hyperspectral imager on board the Tiangong-1 space station at 06:35 UTC on 30 July 2013 was collected.Tiangong-1 was launched by China on 29 September 2011, and it operates at an altitude of 340 km.The effective wavelength and spatial resolution of the Tiangong-1 TIR channel are 9.0 μm and 10 m, respectively.Both the Landsat-8 and Tiangong-1 data were acquired under clear skies.Note that the Tiangong-1 image Thermal Airborne Spectrographic Imager (TASI) images of 17 stripes that covered the entire study area were collected at 02:45-07:10 UTC on 10 July 2012.The resolution of the TASI images was 3 m.The LSTs were estimated through a temperature-emissivity separation method with an accuracy of better than 1.5 K [46].The LSTs of different stripes were temporally normalized to the overpassing time of ASTER on 10 July, with a diurnal temperature cycle model [47].Different stripes of the TASI images were mosaicked and then carefully co-registered with the ASTER images.

Beijing City
The second study area is a suburban area of Beijing city, which is located in Beijing municipality (39 1).Beijing has a hot and humid summer and a cold and dry winter.The relative humidity is approximately 72%-74% in summer and 44%-47% in winter; 75% of the annual precipitation occurs in summer [12].As the capital of China, Beijing has a population of over 20 million and a built-up area of over 1200 km 2 .In recent years, Beijing has suffered from severe UHI effects due to rapid urbanization, physical conditions, and intense human activities.
Landsat-8 OLI and TIRS images that covered Beijing city at 02:55 UTC on 31 July 2013 were collected.The spatial resolutions are 30 m for the OLI multispectral images and 100 m for the TIRS images.In addition, one image acquired by the TIR channel of the hyperspectral imager on board the Tiangong-1 space station at 06:35 UTC on 30 July 2013 was collected.Tiangong-1 was launched by China on 29 September 2011, and it operates at an altitude of 340 km.The effective wavelength and spatial resolution of the Tiangong-1 TIR channel are 9.0 µm and 10 m, respectively.Both the Landsat-8 and Tiangong-1 data were acquired under clear skies.Note that the Tiangong-1 image covered only a suburban area of Beijing city due to its limited scene width (Figure 1).The Tiangong-1 image was then co-registered with the OLI image.
Atmospheric correction of the OLI image was conducted based on MODTRAN and the MODIS atmospheric profile product (i.e., MOD07).After the atmospheric correction, the surface reflectance values were obtained.Estimation of the LSTs from the TIRS and Tiangong-1 data followed the radiative transfer equation, after the atmospheric parameters were obtained from the MODTRAN and MODIS atmospheric profiles (MOD07 for TIRS and MYD07 for Tiangong-1).The land surface emissivities required in estimating the LST were determined from the NDVI threshold method, and the emissivities of the different land cover types were determined according to [48].

LST Downscaling Models
DisTrad uses a quadratic function to link NDVI and LST [8].Considering that (1) the quadratic function that links NDVI and LST could be highly sensitive, and (2) NDVI is less capable of expressing LST with dense vegetation, Agam et al. ( 2007) developed the TsHARP model, which employs a linear function and FVC as the LST descriptor [17].Indeed, some other surface parameters can also explain the spatial variations of the LST [14].Therefore, the statistical downscaling model can be extended to the following generalized form: where T s is LST; the hat denotes the predicted values; the subscripts N and T denote the native and target resolutions, respectively; T s,N is retrieved from the original TIR data; ∆ Ts,N is the LST residual at the native resolution; f denotes the statistical linear or nonlinear regression function; and x is the vector of LST descriptors.
According to the surface characteristics of the Zhangye oasis, two types of surface parameters were selected as potential LST descriptors.The first type was vegetation related, including NDVI and FVC [17,49].The second type of descriptor was the surface broadband shortwave albedo.The surface albedo was selected because it influences the surface radiation balance and is mostly negatively related to the soil moisture over barren agricultural areas [50], two determinants that directly affect the surface energy balance and therefore the LSTs, especially during the daytime.The surface albedo was calculated based on the spectral reflectance of the ASTER VNIR channels, with an empirical equation developed by comprehensive radiative transfer simulation for snow/ice free surfaces [51].
For Beijing city, NDVI and FVC were selected as the first type of descriptor, and the normalized difference built-up index (NDBI) was selected as the second type of descriptor.NDBI was found to have a good ability to explain the spatial variation of LST in urban areas and be a good descriptor in downscaling urban LSTs [18].It is calculated based on the surface reflectance values of OLI band 6 and band 5 [52].
The downscaling also relies on selections of pixels within a certain spatial extent.Three different types of models were examined here, including the global regression model (GRM), the piecewise regression model (PRM), and the local regression model (LRM).GRM used all but the water pixels.PRM was developed to address the possible nonlinearity between LST and NDVI [24].Here, the pixels except for the water pixels were divided into three groups, including NDVI < 0.2, 0.2 ≤ NDVI ≤ 0.5, and NDVI > 0.5, and f was trained for each group separately.LRM uses a moving window approach and constructs f for each window separately.A fixed window size, such as 3 × 3 pixels, was commonly utilized.However, f obtained with a fixed window size might not be optimal.Two automatic schemes were therefore used to determine the optimal window size.For each pixel, the window size was set to range from 3 × 3 pixels up to n × n pixels with an increment of two pixels each time (n is the minimum value of the numbers of lines and columns).The first scheme, hereafter termed LRM-MCD, used the regression coefficient of determination (R 2 ) as the criterion by which the window size with the maximum R 2 was identified.The second scheme, hereafter termed LRM-LPR, employed the predicted residual as the criterion by which the window size with the lowest prediction residual was identified.Both schemes indicate that the window size for the central pixel is optimal only when f yields the highest accuracy.

Quantification of the Scale Effect
One major error source in the current LST downscaling models is from the sub-pixel variations of LST that cannot be explained by the selected descriptors at the native resolution.For example, the lower capability of the LST descriptors used in DisTrad and TsHARP to explain the sub-pixel LST variations induced by soil moisture may contribute to the boxy discontinuities in the downscaled LSTs [53].According to Equation (1), the error of the downscaled LST at the target resolution can be expressed as: where T s,T is the true value of LST at the target resolution.Similar to Equation ( 1), LST at the target resolution can also be modeled as a function of the descriptors: where f T is the regression function at the target resolution.
Inserting Equation (3) into Equation (2) yields: Thus, the error of the downscaled LST can be categorized into two parts, as shown in the two terms in the brackets on the right side of Equation (4).The first term is still related to the sub-pixel variations of LST that cannot be explained by the selected descriptors.The second term is induced by the possible scale-variant relationship between LST and its descriptors.Therefore, it expresses the scale effect in LST downscaling and will be denoted by ∆T s,scale hereafter.If the quadratic term in DisTrad is omitted, ∆T s,scale can be expressed as: where a and b are the intercept and slope in the regression function, respectively; k is the number of descriptors; x i denotes the i-th (i = 1, 2, . . ., k) descriptor.Since f is trained through linear regression, the mean LSTs at the native and target resolutions satisfy the equation below: where T s,N and T s,T are the mean LSTs at the native and target resolutions, respectively; x N and x T are the vectors of the mean values of descriptors at the native and target resolutions, respectively; x N,i and x T,i are the mean values of the i-th descriptors at the native and target resolutions, respectively.
Then we can obtain: After subtracting Equation (7) from Equation (5), ∆T s,scale can be expressed as: Equation ( 8) is a generalized form to quantify the scale effect in LST downscaling when the regression function is linear.For downscaling models, in which f N and f T are trained based on the same spatial extents, the mean values of LST at the native and target resolutions should be equal, as well as the descriptors.Equation ( 8) can be converted to: According to Equation ( 9), the key step to quantify the scale effect is to determine the slopes (i.e., b T,i ) in the regression function at the target resolution.Based on the LST and descriptors at the native resolution, a series of LST and descriptors at coarser resolutions can be created through upscaling.We hypothesize that b T,i can be predicted by extrapolating the statistical function of b i trained at the native and lower resolutions.This hypotheses and the extrapolation process will be analyzed in Section 4.
∆T s,scale should be zero once the LST-descriptors relationship is scale-invariant.To remove the scale effect, Equation (1) should be revised as:

Implementation and Evaluation of the Downscaled LSTs
The whole flow of this study is shown in Figure 2. The downscaling models were first applied to the Zhangye oasis on 10 July 2012 and Beijing city on 30 July 2013.To exhibit the scale effect qualitatively, the TASI LST image in the Zhangye oasis was upscaled from the 3-m resolution to a series of coarser resolutions from 15 to 990 m with an increment of 15 m using the linear aggregation technique [29,54]; similarly, the Tiangong-1 LST image in Beijing city was upscaled from 10 m to resolutions from 30 to 990 m with an increment of 30 m.To better simulate the TIRS LST, the Tiangong-1 LST was also aggregated to the 100-m resolution.Both the upscaled TASI and Tiangong-1 LSTs were then downscaled from 90 to 15 m (for TASI) and from 100 to 30 m (for Tiangong-1).Note that downscaling was not applied to the simulated LSTs that were coarser than 90 m (for TASI) or 100 m (for Tiangong-1) because the purpose of this study is to downscale the LST from the native resolution to the resolution of the VNIR bands of sensors on the same satellite.With the simulation datasets, the possibility of predicting the slopes b T,i was tested.Then, the ASTER and TIRS LSTs were downscaled.
To better understand the impacts of LST descriptors on downscaling, comparisons were made between the downscaled LSTs in two cases (referred to as Case A and Case B).In Case A, a single vegetation parameter (NDVI or FVC) with the best regression accuracy was selected as the descriptor; in Case B, both the surface albedo (in the Zhangye oasis) or NDBI (in Beijing city) and one of the two vegetation parameters were used, and the selection of the vegetation parameter also relied on the regression accuracy.
In the Zhangye oasis, the downscaled TASI LSTs were evaluated in two ways.First, they were compared against the TASI LST that was upscaled to a 15-m resolution.Second, they were upscaled back to 90 m and then compared against the TASI LST that was upscaled to a 90-m resolution.The downscaled ASTER LSTs were validated with the in situ measured LSTs; in addition, they were upscaled back to 90 m and then compared with the ASTER LSTs before downscaling.Because the in situ LSTs were not available in Beijing city, the downscaled LSTs from Tiangong-1 simulated LST and TIRS LST were evaluated by comparing them against the original LSTs.In comparison with the original LSTs, the mean bias difference (MBD) and root-mean squared difference (RMSD) were used because the original LSTs have uncertainties.In validation with the in situ LSTs, the mean bias error (MBE) and RMSE were used.The formulas of MBD and RMSD are: where n is the number of samples; T s,d and T s,o are the downscaled and original LSTs, respectively.When T s,o is replaced by the in situ LST, the calculated results from these two equations are MBE and RMSE.
The image quality index (Q) was calculated to evaluate the image quality of the downscaled LSTs.Q was shown to be powerful in fulfilling this task because it includes a combination of the loss of correlation, luminance distortion, and contrast distortion.Q was calculated as follows [55]: where N and T are the mean LSTs at the native and target resolutions, respectively; σ 2 N and σ 2 T are their variances; and σ NT is the covariance between the LST images before and after downscaling.According to [33] and [27], a moving window method with sizes of 8 × 8 pixels, 16 × 16 pixels, 32 × 32 pixels, 64 × 64 pixels, and 128 × 128 pixels was used to calculate Q, and the mean values were taken for comparison.
back to 90 m and then compared against the TASI LST that was upscaled to a 90-m resolution.The downscaled ASTER LSTs were validated with the in situ measured LSTs; in addition, they were upscaled back to 90 m and then compared with the ASTER LSTs before downscaling.Because the in situ LSTs were not available in Beijing city, the downscaled LSTs from Tiangong-1 simulated LST and TIRS LST were evaluated by comparing them against the original LSTs.In comparison with the original LSTs, the mean bias difference (MBD) and root-mean squared difference (RMSD) were used because the original LSTs have uncertainties.In validation with the in situ LSTs, the mean bias error (MBE) and RMSE were used.The formulas of MBD and RMSD are: where n is the number of samples; s,d T and s,o T are the downscaled and original LSTs, respectively.When s,o T is replaced by the in situ LST, the calculated results from these two equations are MBE and RMSE.
The image quality index (Q) was calculated to evaluate the image quality of the downscaled LSTs.Q was shown to be powerful in fulfilling this task because it includes a combination of the loss of correlation, luminance distortion, and contrast distortion.Q was calculated as follows [55]: where N and T are the mean LSTs at the native and target resolutions, respectively; σ is the covariance between the LST images before and after downscaling.According to [33] and [27], a moving window method with sizes of 8 × 8 pixels, 16 × 16 pixels, 32 × 32 pixels, 64 × 64 pixels, and 128 × 128 pixels was used to calculate Q, and the mean values were taken for comparison.

Scale Effect Quantified with TASI Data
With the TASI simulated LSTs and ASTER derived parameters, the linear regression functions were trained at different resolutions.The R 2 values of the function of LST-NDVI are shown in Figure 3a.R 2 increases from 0.545 at 15 m to 0.824 at 90 m and remains stable at approximately 0.86-0.89after 150 m.All of the regressions are significant at the 0.001 level.Higher regression accuracy is obtained at resolutions that are coarser than 15 m due to having fewer pixel outliers.In contrast, the function of LST-FVC yields lower regression accuracy: R 2 increases from 0.530 at 15 m to 0.803 at 90 m and is ~0.84 after 150 m, which demonstrates that FVC is less capable of explaining the spatial variation of LST than NDVI in this study area on 10 July.Selecting the surface albedo as the second descriptor contributes only a slight increment to the regression accuracy.
Equation ( 9) demonstrates that the LST-descriptors relationship in the linear DisTrad model is scale-invariant only when Equation ( 1) the values of the descriptors are equal to the mean values of all of the selected pixels at the target resolution or Equation ( 2) the slopes at the native and target resolutions are equal.Note that similar results can be inferred for the linear TsHARP model.To verify the validity of Equation ( 9) and the hypotheses in an independent way, the mean values of each parameter (i.e., LST, NDVI, FVC, and surface albedo) at different resolutions were calculated.Furthermore, the scale effect was quantified according to Equation (5) by varying NDVI from 0 to 0.9 in increments of 0.1.The quantified scale effects at four NDVI values are shown in Figure 3b.On the one hand, the mean values of these parameters at different resolutions have ignorable changes.For example, the mean values of LST, NDVI, FVC, and albedo are 304.38 ± 0.03 K, 0.628 ± 0.0, 0.639 ± 0.0, and 0.206 ± 0.0 in the range of 15 to 90 m, respectively.Thus, the transformation of Equation (5) to Equation ( 9) is valid.On the other hand, Figure 3b demonstrates that the scale effect when NDVI = 0.6 is ignorable (~0.1 K) because 0.6 is close to the mean value of NDVI; the scale effect increases when NDVI deviates from the mean value.Thus, it is reasonable to infer that the scale effect in the LST downscaling increases over heterogeneous surfaces.A closer look at Figure 3b further demonstrates that the scale effect is positive when NDVI is lower than the mean value, which reveals that the scale effect has a positive contribution to the downscaling error, and vice versa.Moreover, the scale effect also depends on the ratio of the native resolution to the target resolution.The fluctuations for resolutions that are coarser than 510 m are induced by the pixels that are located near the boundary of the oasis and its surrounding desert.The independent verification based on Equation (5) confirms the validity of Equation ( 9) and the hypotheses.Although the finding from the case when FVC acts as the descriptor is not shown here, similar results are also found.Equation ( 9) demonstrates that the LST-descriptors relationship in the linear DisTrad model is scale-invariant only when Equation ( 1) the values of the descriptors are equal to the mean values of all of the selected pixels at the target resolution or Equation ( 2) the slopes at the native and target resolutions are equal.Note that similar results can be inferred for the linear TsHARP model.To verify the validity of Equation ( 9) and the hypotheses in an independent way, the mean values of each parameter (i.e., LST, NDVI, FVC, and surface albedo) at different resolutions were calculated.Furthermore, the scale effect was quantified according to Equation (5) by varying NDVI from 0 to 0.9 in increments of 0.1.The quantified scale effects at four NDVI values are shown in Figure 3b.On the one hand, the mean values of these parameters at different resolutions have ignorable changes.For example, the mean values of LST, NDVI, FVC, and albedo are 304.38 ± 0.03 K, 0.628 ± 0.0, 0.639 ± 0.0, and 0.206 ± 0.0 in the range of 15 to 90 m, respectively.Thus, the transformation of Equation (5) to Equation ( 9) is valid.On the other hand, Figure 3b demonstrates that the scale effect when NDVI=0.6 is ignorable (~0.1 K) because 0.6 is close to the mean value of NDVI; the scale effect increases when NDVI deviates from the mean value.Thus, it is reasonable to infer that the scale effect in the LST downscaling increases over heterogeneous surfaces.A closer look at Figure 3b further demonstrates that the scale effect is positive when NDVI is lower than the mean value, which reveals that the scale effect has a positive contribution to the downscaling error, and vice versa.Moreover, the scale effect also depends on the ratio of the native resolution to the target resolution.The fluctuations for resolutions that are coarser than 510 m are induced by the pixels that are located near the boundary of the oasis and its surrounding desert.The independent verification based on Equation (5) confirms the validity of Equation ( 9) and the hypotheses.Although the finding from the case when FVC acts as the descriptor is not shown here, similar results are also found.In addition to GRM, the optimal regression functions LRM-MCD and LRM-LPR were trained.As demonstrated by R 2 , the trained functions have good accuracies and are significant at the 0.001 level.The scale effects in LRM-MCD and LRM-LPR were quantified according to Equation ( 9), because the invariant mean values of the LST and descriptors can be guaranteed.However, PRM exhibits exceptions on two aspects.First, no regression function with acceptable accuracy was obtained when NDVI < 0.2, due to the weak ability of NDVI to explain the LST variation over barren or lowly vegetated surfaces; thus, the regression function of GRM was utilized instead.Second, the mean values of the LST and descriptors vary according to the resolution because the spatial extents that provide the pixels in training the regression functions are different at the native and target resolutions.Thus, the scale effect in PRM was quantified according to Equation (5).The spatial distributions of the scale effects in all of the models are shown in Figure 4 by images.
Remote Sens. 2016, 8, 975 10 of 24 In addition to GRM, the optimal regression functions LRM-MCD and LRM-LPR were trained.As demonstrated by R 2 , the trained functions have good accuracies and are significant at the 0.001 level.The scale effects in LRM-MCD and LRM-LPR were quantified according to Equation ( 9), because the invariant mean values of the LST and descriptors can be guaranteed.However, PRM exhibits exceptions on two aspects.First, no regression function with acceptable accuracy was obtained when NDVI < 0.2, due to the weak ability of NDVI to explain the LST variation over barren or lowly vegetated surfaces; thus, the regression function of GRM was utilized instead.Second, the mean values of the LST and descriptors vary according to the resolution because the spatial extents that provide the pixels in training the regression functions are different at the native and target resolutions.Thus, the scale effect in PRM was quantified according to Equation (5).The spatial distributions of the scale effects in all of the models are shown in Figure 4 by images.As shown in Figure 4, the scale effect mainly ranges from −0.5 to 1.5 K for GRM, with negative contributions appearing over vegetated surfaces and positive contributions over built-up and barren surfaces; these results confirm the finding from Figure 3b.The scale effect for PRM is positive for most of the surfaces.No evident differences are found between the quantified scale effects in Cases A and Case B for GRM or PRM.Compared with GRM and PRM, the scale effects of LRM-MCD and LRM-LPR are more significant, and they range from −3.0 to 3.0 K (Figure 4e-h).The spatial distribution of the scale effect is more irregular than on GRM and PRM, which is mostly due to the varying optimal window size for each pixel.The scale effects are closed to zero over most of the agricultural surfaces, which suggests that the local models are more capable of avoiding the scale effect over this type of surface.Several north-south linear patches suffer more from the scale effect.This phenomenon is probably induced by the mosaic of the TASI LST stripes.
The downscaled TASI LSTs were evaluated.At the 15-m resolution, the downscaled LSTs from all of the models with the scale effect retained have very similar MBD and RMSD values: MBD ranges from 0.01 K to 0.04 K, which indicates that no systematic difference exists, and RMSD ranges from  As shown in Figure 4, the scale effect mainly ranges from −0.5 to 1.5 K for GRM, with negative contributions appearing over vegetated surfaces and positive contributions over built-up and barren surfaces; these results confirm the finding from Figure 3b.The scale effect for PRM is positive for most of the surfaces.No evident differences are found between the quantified scale effects in Cases A and Case B for GRM or PRM.Compared with GRM and PRM, the scale effects of LRM-MCD and LRM-LPR are more significant, and they range from −3.0 to 3.0 K (Figure 4e-h).The spatial distribution of the scale effect is more irregular than on GRM and PRM, which is mostly due to the varying optimal window size for each pixel.The scale effects are closed to zero over most of the agricultural surfaces, which suggests that the local models are more capable of avoiding the scale effect over this type of surface.Several north-south linear patches suffer more from the scale effect.This phenomenon is probably induced by the mosaic of the TASI LST stripes.
The downscaled TASI LSTs were evaluated.At the 15-m resolution, the downscaled LSTs from all of the models with the scale effect retained have very similar MBD and RMSD values: MBD ranges from 0.01 K to 0.04 K, which indicates that no systematic difference exists, and RMSD ranges from 3.26 K to 3.33 K.The large RMSD are caused by the co-registration error between the ASTER derived descriptors and the TASI LST.At the 90-m resolution, the downscaled LSTs that have been upscaled back to 90 m also yield no systematic difference, and the RMSD values are approximately 1.0 K.The Q values are approximately 0.96, which demonstrates that good image qualities have been obtained.With regard to the scale effect being removed, the downscaled LSTs from GRM and LRM possess slight improvements in their accuracies; for PRM, the downscaling LSTs exhibit deteriorated accuracies.Removing the scale effect shows no evident influences on the Q values.
In contrast to GRM, PRM and LRM show no evident advantages in LST downscaling in our study area.In actual applications, GRM is the simplest to implement [13], PRM could have difficulty in obtaining an accurate regression function when NDVI < 0.2, and LRM is extremely time-consuming in searching the optimal window to train the regression functions.Therefore, GRM is selected for further examination.
According to Equation ( 9), the slopes of the descriptors at the target resolution are required to quantify the scale effect, but they are unavailable in actual applications.Based on the upscaled datasets, we find that there is a significant correlation between the slopes and the standard deviations (STD) of the descriptors.As shown in Figure 5, the slopes of NDVI in both Case A and Case B can be linearly predicted by NDVI STD in the resolution range of 510 to 90 m, although the correlation decreases at resolutions that are coarser than 510 m.R 2 values are 0.699 in Case A and 0.769 in Case B (Figure 5a).In Case B, the slope of albedo in the range of 510 to 90 m can be predicted with a quadratic function of albedo STD, and the corresponding R 2 is 0.897 (Figure 5b).These regression functions between the slopes and STDs of the descriptors can be extrapolated to resolutions that are finer than 90 m, although the correlations exhibit slight changes.As shown in Figure 6, the estimated scale effects have very similar spatial distributions as the scale effects that are quantified with the LST at the target resolution.Nevertheless, in Case A, the estimated scale effect is weaker than that in Figure 4 because the slope-NDVI STD function yields underestimations of the slopes in Case A (Figure 5a); in Case B, the estimated scale effect is more significant than that in Figure 4 due to the influences of NDVI and albedo.
Remote Sens. 2016, 8, 975 11 of 24 possess slight improvements in their accuracies; for PRM, the downscaling LSTs exhibit deteriorated accuracies.Removing the scale effect shows no evident influences on the Q values.
In contrast to GRM, PRM and LRM show no evident advantages in LST downscaling in our study area.In actual applications, GRM is the simplest to implement [13], PRM could have difficulty in obtaining an accurate regression function when NDVI < 0.2, and LRM is extremely time-consuming in searching the optimal window to train the regression functions.Therefore, GRM is selected for further examination.
According to Equation ( 9), the slopes of the descriptors at the target resolution are required to quantify the scale effect, but they are unavailable in actual applications.Based on the upscaled datasets, we find that there is a significant correlation between the slopes and the standard deviations (STD) of the descriptors.As shown in Figure 5, the slopes of NDVI in both Case A and Case B can be linearly predicted by NDVI STD in the resolution range of 510 to 90 m, although the correlation decreases at resolutions that are coarser than 510 m.R 2 values are 0.699 in Case A and 0.769 in Case B (Figure 5a).In Case B, the slope of albedo in the range of 510 to 90 m can be predicted with a quadratic function of albedo STD, and the corresponding R 2 is 0.897 (Figure 5b).These regression functions between the slopes and STDs of the descriptors can be extrapolated to resolutions that are finer than 90 m, although the correlations exhibit slight changes.As shown in Figure 6, the estimated scale effects have very similar spatial distributions as the scale effects that are quantified with the LST at the target resolution.Nevertheless, in Case A, the estimated scale effect is weaker than that in Figure 4 because the slope-NDVI STD function yields underestimations of the slopes in Case A (Figure 5a); in Case B, the estimated scale effect is more significant than that in Figure 4 due to the influences of NDVI and albedo.possess slight improvements in their accuracies; for PRM, the downscaling LSTs exhibit deteriorated accuracies.Removing the scale effect shows no evident influences on the Q values.
In contrast to GRM, PRM and LRM show no evident advantages in LST downscaling in our study area.In actual applications, GRM is the simplest to implement [13], PRM could have difficulty in obtaining an accurate regression function when NDVI < 0.2, and LRM is extremely time-consuming in searching the optimal window to train the regression functions.Therefore, GRM is selected for further examination.
According to Equation ( 9), the slopes of the descriptors at the target resolution are required to quantify the scale effect, but they are unavailable in actual applications.Based on the upscaled datasets, we find that there is a significant correlation between the slopes and the standard deviations (STD) of the descriptors.As shown in Figure 5, the slopes of NDVI in both Case A and Case B can be linearly predicted by NDVI STD in the resolution range of 510 to 90 m, although the correlation decreases at resolutions that are coarser than 510 m.R 2 values are 0.699 in Case A and 0.769 in Case B (Figure 5a).In Case B, the slope of albedo in the range of 510 to 90 m can be predicted with a quadratic function of albedo STD, and the corresponding R 2 is 0.897 (Figure 5b).These regression functions between the slopes and STDs of the descriptors can be extrapolated to resolutions that are finer than 90 m, although the correlations exhibit slight changes.As shown in Figure 6, the estimated scale effects have very similar spatial distributions as the scale effects that are quantified with the LST at the target resolution.Nevertheless, in Case A, the estimated scale effect is weaker than that in Figure 4 because the slope-NDVI STD function yields underestimations of the slopes in Case A (Figure 5a); in Case B, the estimated scale effect is more significant than that in Figure 4 due to the influences of NDVI and albedo.

Scale Effect Quantified with ASTER Data
To predict the slopes of the descriptors in downscaling ASTER LSTs, the correlation between the slopes and STDs were analyzed.The results are shown in Table 1.In general, a significant correlation is found on all dates in both Case A and Case B, except for albedo in Case B on 10 July and 12 September.In both Case A and Case B, the correlation between the slope and STD of NDVI is positive, and the relationship can be sufficiently described by a linear function, which demonstrates that the slope increases at a finer resolution.This finding also confirms the existing scale effect in the LST downscaling.The slope of albedo also correlates to albedo STD, although the forms of the functions are diverse on different dates.A closer look at Table 1 shows that the slopes of NDVI in Case A and Case B are similar from 15 June to 12 September, during which the correlations between the slope and STD of albedo are relatively weak.One reason is that the surface albedo is less capable of explaining the spatial variation of LST at the native resolution on these dates than on 30 May, 19 September, and 28 September.For example, NDVI and albedo, respectively, can explain 36.6% and 22.7% of the LST variation on 30 May, 48.9% and 5% on 19 September, and 23.3% and 7.2% on 28 September; the portion of the LST variation that is explained by albedo decreases to 0.1% on 10 July and 0.3% on 12 September.During this period, albedo is not a necessary descriptor.The ASTER LSTs on the 12 dates were downscaled from 90 to 15 m.The results of GRM in Case B are shown in Figure 7.To better understand the relationship between the model performance and the crop growth, the LSTs that were downscaled from different models were validated on each date separately.Note that the results on 19 and 28 September were not validated due to a lack of in situ measured LSTs.For a fair comparison, the LSTs before downscaling were also validated.The assessments at the vegetation sites (i.e., all of the sites except for EC04) are shown in Table 2, and those for EC04 are shown in Table 3.
With regard to the vegetation sites, MBE and RMSE of the LSTs downscaled by GRM experience a decline first and then a rise.This trend is consistent with MBE and RMSE of the LSTs without downscaling.According to the DisTrad model and Equation (4), the errors of the downscaled LST are from the following sources.The first source is from the accuracy of the regression function and the abilities of the descriptors to explain the LST variation.In Case A, NDVI explains only 36.6%, 36.1%,56.5%, 72.2%, and 72.0% of the LST variation on 30 May, 15 June, 24 June, 3 September, and 12 September, respectively, whereas the generally higher LST variation can be explained by NDVI in July and August.The second source is the combined influences of the surface heterogeneity and the scale mismatch between the pixel and the ground sites.The third source is the uncertainties in the LSTs at 90 m due to uncertainties that are related to the atmospheric profile, surface emissivity, and LST retrieval algorithm.uncertainties in the LSTs at 90 m due to uncertainties that are related to the atmospheric profile, surface emissivity, and LST retrieval algorithm.With regard to GRM in Case A, removing the scale effect cannot significantly improve the accuracy of the downscaled LST in general.One exception is 30 May, when the MBE/RMSE in Case A decrease by 1.03/0.90K.This improvement in the accuracy results from the positive contribution of the scale effect over barren or lowly vegetated surfaces (Figure 8) and the overestimation of the original ASTER LST (Table 2).When the vegetation abundance increases, the scale effect becomes negative; removing the negative scale effect induces increasing MBE and RMSE because the downscaling model overestimates the LST.This phenomenon can be found on 24 June, 10 July, 18 August, 27 August, 3 September, and 12 September.An exception is 2 August: GRM underestimates the LST (MBE/RMSE: −0.51/0.84K); thus, removing the negative scale effect improves the downscaling accuracy (MBE/RMSE: −0.07/0.60K).When the albedo is selected as the second descriptor (i.e., Case B), the downscaling accuracies on a few dates are higher than in Case A. Removing the scale effect improves the downscaling accuracies on 30 May and 2 August.With regard to GRM in Case A, removing the scale effect cannot significantly improve the accuracy of the downscaled LST in general.One exception is 30 May, when the MBE/RMSE in Case A decrease by 1.03/0.90K.This improvement in the accuracy results from the positive contribution of the scale effect over barren or lowly vegetated surfaces (Figure 8) and the overestimation of the original ASTER LST (Table 2).When the vegetation abundance increases, the scale effect becomes negative; removing the negative scale effect induces increasing MBE and RMSE because the downscaling model overestimates the LST.This phenomenon can be found on 24 June, 10 July, 18 August, 27 August, 3 September, and 12 September.An exception is 2 August: GRM underestimates the LST (MBE/RMSE: −0.51/0.84K); thus, removing the negative scale effect improves the downscaling accuracy (MBE/RMSE: −0.07/0.60K).When the albedo is selected as the second descriptor (i.e., Case B), the downscaling accuracies on a few dates are higher than in Case A. Removing the scale effect improves the downscaling accuracies on 30 May and 2 August.In contrast to the vegetation sites, the original ASTER LST and downscaled LST at EC04 exhibit larger and negative errors (Table 3).The main reasons are the scale mismatch between the ground site and its corresponding pixel and the significant overestimation of the surface emissivity due to the heterogeneous underlying surface [40].Compared with the original LST, the downscaling can partly mitigate the scale mismatch problem.The overall MBE/RMSE decrease from −5.15/5.58K to −4.67/4.98K in Case A. The albedo can further improve the downscaling accuracy.In Case B, MBE/RMSE are −4.57/4.84K, respectively.However, removing the scale effect deteriorates the accuracies of the downscaled LST.EC04 possesses positive scale effects on all of the 12 dates.Thus, removing the positive scale effect from the underestimated LST after downscaling further underestimates the LST.
Q calculated with the original ASTER LSTs and the downscaled LSTs that have been upscaled back to 90 m are shown in Table 4. Q demonstrates that GRM in both Case A and Case B can capture the LST spatial patterns in the study area.On a few dates, selecting albedo as the second descriptor contributes to a slight improvement on the downscaled LST images.As shown for all of the 12 dates, removing the scale effect deteriorates the qualities of the downscaled LST images, and the Q values in Case B are lower than those in Case A. On the one hand, the spatial pattern of the scale effect in GRM depends on the landscape of the study area, and densely/lowly vegetated surfaces possess negative/positive scale effects (Figure 8).On the other hand, the downscaled LST has a similar spatial pattern as the original LST as demonstrated by the high Q values.Thus, removing the scale effect from the downscaled LST alters the spatial patterns of the LST and degrades the quality of the LST image.With the LST and descriptors simulated by Tiangong-1 and OLI data at different resolutions, the linear regression functions of the LST-descriptors were trained.Figure 9a shows R 2 values of the function of LST-NDVI: R 2 is the lowest at 30 m (0.749) and increases rapidly from 30 m to 240 m, at which R 2 is maximum (0.827) and R 2 varies slightly after 240 m.All of the regressions are significant at the 0.001 level.R 2 of the function of LST-FVC is slightly lower than LST-NDVI, which indicates that FVC is less capable of explaining the spatial variation of LST than NDVI in the study area.However, selecting NDBI as the second descriptor can better explain the LST variation: R 2 of the regression function is 0.783 at 30 m and 0.874 at 210 m and 240 m.Similar to in the Zhangye oasis, the mean values of the parameters have ignorable variations at different resolutions: 322.19 ± 0.07 K for LST, 0.485 ± 0.004 for NDVI, and −0.274 ± 0.002 for NDBI.Therefore, the transformation of Equation (5) Remote Sens. 2016, 8, 975 16 of 24 to Equation ( 9) is valid.Furthermore, these two equations result in the same scale effects.The scale effects at four NDVI values are shown in Figure 9b.As expected, the scale effect is ignorable when NDVI = 0.5 because 0.5 is closed to the mean value of NDVI.At the other NDVI values, the scale effect experiences a rise first, reaches a peak at approximately 150 to 180 m and then shows a decline, which is due to similar variations in the slope.The scale effect is close to zero when the resolution is 600-630 m and exhibits fluctuations at coarser resolutions.The variation in the scale effect indicates that it depends on the ratio of the native resolution to the target resolution.Similar to the findings in the Zhangye oasis, the scale effect is positive when NDVI is lower than the mean value, and vice versa.
Remote Sens. 2016, 8, 975 16 of 24 to Equation ( 9) is valid.Furthermore, these two equations result in the same scale effects.The scale effects at four NDVI values are shown in Figure 9b.As expected, the scale effect is ignorable when NDVI = 0.5 because 0.5 is closed to the mean value of NDVI.At the other NDVI values, the scale effect experiences a rise first, reaches a peak at approximately 150 to 180 m and then shows a decline, which is due to similar variations in the slope.The scale effect is close to zero when the resolution is 600-630 m and exhibits fluctuations at coarser resolutions.The variation in the scale effect indicates that it depends on the ratio of the native resolution to the target resolution.Similar to the findings in the Zhangye oasis, the scale effect is positive when NDVI is lower than the mean value, and vice versa.The scale effects of GRM, PRM, LRM-MCD, and LRM-LPR are quantified, and they are shown in Figure 10 by images.For GRM, the scale effect varies in the ranges of −0.38 to 0.36 K in Case A and −1.90 to 1.89 K in Case B. The spatial pattern of the scale effect of GRM in Case A correlates to the spatial distributions of the land cover types: the scale effect is positive over built-up areas and negative in highly vegetated surfaces.The scale effect of GRM in Case B is much more significant than in Case A due to the influences from both NDVI and NDBI.The spatial patterns of the scale effect of PRM in Case A and Case B depend on the spatial distributions of different NDVI groups.The scale effect of PRM in Case A is slightly higher than GRM in Case A, and in addition, is slightly higher than that of PRM in Case B. In contrast to GRM and PRM, LRM-MCD and LRM-LPR possess more significant scale effects.In addition, the spatial distributions of the scale effects are more irregular than GRM and PRM due to the varying optimal window size for each pixel in LRM-MCD and LRM-LPR.
In downscaling the Tiangong-1 LST, GRM has better accuracy than PRM, LRM-MCD, and LRM-LPR.For the 30-m resolution, RMSD of GRM is 0.30-0.37K, which is 0.32-0.34K lower than that of PRM and LRM in Case A and Case B, respectively.At the 100-m resolution, GRM also has slightly higher accuracy than the other downscaling models and similar image quality according to the Q value.In addition, selecting NDBI as the second descriptor in such a suburban area improves the downscaling accuracy.Removing the scale effect in the downscaling has a very weak contribution on improving the accuracy of the downscaled LST.Considering that the GRM is simple to implement in actual applications, it is selected for further investigation.The scale effects of GRM, PRM, LRM-MCD, and LRM-LPR are quantified, and they are shown in Figure 10 by images.For GRM, the scale effect varies in the ranges of −0.38 to 0.36 K in Case A and −1.90 to 1.89 K in Case B. The spatial pattern of the scale effect of GRM in Case A correlates to the spatial distributions of the land cover types: the scale effect is positive over built-up areas and negative in highly vegetated surfaces.The scale effect of GRM in Case B is much more significant than in Case A due to the influences from both NDVI and NDBI.The spatial patterns of the scale effect of PRM in Case A and Case B depend on the spatial distributions of different NDVI groups.The scale effect of PRM in Case A is slightly higher than GRM in Case A, and in addition, is slightly higher than that of PRM in Case B. In contrast to GRM and PRM, LRM-MCD and LRM-LPR possess more significant scale effects.In addition, the spatial distributions of the scale effects are more irregular than GRM and PRM due to the varying optimal window size for each pixel in LRM-MCD and LRM-LPR.
In downscaling the Tiangong-1 LST, GRM has better accuracy than PRM, LRM-MCD, and LRM-LPR.For the 30-m resolution, RMSD of GRM is 0.30-0.37K, which is 0.32-0.34K lower than that of PRM and LRM in Case A and Case B, respectively.At the 100-m resolution, GRM also has slightly higher accuracy than the other downscaling models and similar image quality according to the Q value.In addition, selecting NDBI as the second descriptor in such a suburban area improves the downscaling accuracy.Removing the scale effect in the downscaling has a very weak contribution on improving the accuracy of the downscaled LST.Considering that the GRM is simple to implement in actual applications, it is selected for further investigation.As in the Zhangye oasis, there is also a significant correlation between the slopes and STDs of the descriptors (Figure 11).When the resolution is 100-510 m, the slope of NDVI in Case A can be predicted with a quadratic function of NDVI STD, and the slopes of NDVI and NDBI in Case B can be predicted with linear functions of NDVI STD and NDBI STD.These linear/quadratic functions can be extrapolated to resolutions that are finer than 100 m (i.e., 90 m, 60 m, and 30 m).In Case A, the quadratic function for the slope of NDVI slightly underestimates the slope at 30 m (i.e., −0.245) and subsequently underestimates the absolute value of the slope difference between the native resolution and the target resolution; thus, the estimated slope of NDVI induces a slight underestimation of the range of the scale effect (Figure 12a and 10a).In Case B, the linear functions for the slopes of NDVI and NDBI overestimates/underestimates the corresponding slopes at the 30-m resolution; thus, the estimated slopes of NDVI and NDBI slightly underestimate the range of the scale effect (Figure 12b and 10b).As in the Zhangye oasis, there is also a significant correlation between the slopes and STDs of the descriptors (Figure 11).When the resolution is 100-510 m, the slope of NDVI in Case A can be predicted with a quadratic function of NDVI STD, and the slopes of NDVI and NDBI in Case B can be predicted with linear functions of NDVI STD and NDBI STD.These linear/quadratic functions can be extrapolated to resolutions that are finer than 100 m (i.e., 90 m, 60 m, and 30 m).In Case A, the quadratic function for the slope of NDVI slightly underestimates the slope at 30 m (i.e., −0.245) and subsequently underestimates the absolute value of the slope difference between the native resolution and the target resolution; thus, the estimated slope of NDVI induces a slight underestimation of the range of the scale effect (Figures 10a and 12a).In Case B, the linear functions for the slopes of NDVI and NDBI overestimates/underestimates the corresponding slopes at the 30-m resolution; thus, the estimated slopes of NDVI and NDBI slightly underestimate the range of the scale effect (Figures 10b  and 12b).As in the Zhangye oasis, there is also a significant correlation between the slopes and STDs of the descriptors (Figure 11).When the resolution is 100-510 m, the slope of NDVI in Case A can be predicted with a quadratic function of NDVI STD, and the slopes of NDVI and NDBI in Case B can be predicted with linear functions of NDVI STD and NDBI STD.These linear/quadratic functions can be extrapolated to resolutions that are finer than 100 m (i.e., 90 m, 60 m, and 30 m).In Case A, the quadratic function for the slope of NDVI slightly underestimates the slope at 30 m (i.e., −0.245) and subsequently underestimates the absolute value of the slope difference between the native resolution and the target resolution; thus, the estimated slope of NDVI induces a slight underestimation of the range of the scale effect (Figure 12a and 10a).In Case B, the linear functions for the slopes of NDVI and NDBI overestimates/underestimates the corresponding slopes at the 30-m resolution; thus, the estimated slopes of NDVI and NDBI slightly underestimate the range of the scale effect (Figure 12b and 10b).The TIRS LST at 100 m was downscaled to 30 m with GRM in both Case A and Case B. The results are shown in Figure 13.It is obvious that the downscaled LSTs have very similar spatial patterns to the LST without downscaling.However, the downscaled LSTs can provide much more thermal details.Because the scale effect is weak, removing the scale effect has a weak alternation on the spatial pattern of LST (Figure 13b,c).The downscaled LSTs were upscaled back to 100 m and compared against the original TIRS LST.The evaluation results are shown in Table 6.As demonstrated by Table 6, selecting NDBI as the second descriptor can slightly improve the accuracy and image quality of the downscaled LST.Removing the scale effect contributes a very slight improvement to the accuracy and image quality of the downscaled LST.

Scale Effect Quantified with TIRS Data
Based on the Landsat-8 TIRS LST and the corresponding descriptors, we find that there is a significant linear correlation between the slopes of the descriptors and their STDs in the resolution range of 100-510 m.The regression functions are shown in Table 5.These regression functions are used to predict the slopes of the descriptors at the 30-m resolution.The subsequently estimated scale effects in Case A and Case B are shown in Figure 12c,d.They have very similar spatial patterns to the quantified scale effects in downscaling the Tiangong-1 simulated LST at the 100-m resolution (Figure 10a,b).In general, built-up surfaces possess a positive scale effect, while vegetated surfaces possess a negative scale effect.The TIRS LST at 100 m was downscaled to 30 m with GRM in both Case A and Case B. The results are shown in Figure 13.It is obvious that the downscaled LSTs have very similar spatial patterns to the LST without downscaling.However, the downscaled LSTs can provide much more thermal details.Because the scale effect is weak, removing the scale effect has a weak alternation on the spatial pattern of LST (Figure 13b,c).The downscaled LSTs were upscaled back to 100 m and compared against the original TIRS LST.The evaluation results are shown in Table 6.As demonstrated by Table 6, selecting NDBI as the second descriptor can slightly improve the accuracy and image quality of the downscaled LST.Removing the scale effect contributes a very slight improvement to the accuracy and image quality of the downscaled LST.

Scale Effect and Its Influencing Factors
By concentrating on the downscaling LST, we examined the scale effect, which is commonly ignored based on the assumption of the scale-invariant LST-descriptors relationship.In two study areas that have typical landscapes, we quantified the scale effect in downscaling ASTER LST from 90 to 15 m and TIRS LST from 100 to 30 m.This study confirms the finding of the scale effect reported by [24] and [30] in agricultural areas in India and by [29] in grasslands and agricultural areas in China.One evident difference between our study and the other studies is that our purpose is to downscale satellite LST with a medium resolution (~100 m) to high resolutions, while the target resolutions of the other studies are much coarser.Thus, it is reasonable to infer that the scale effect is an extensively existing phenomenon in LST downscaling and the assumption of the scale-invariant LST-descriptors relationship is conditional.
From the good correlation between the slopes of the descriptors in the linear regression for downscaling LST and the STDs of the descriptors, it is evident that the scale effect is mainly caused by the varying probability distribution of the LST and its descriptors at the native and target resolutions.A comparison between the experiments in the Zhangye oasis and Beijing city demonstrates that the scale effect depends on the ratio of the native resolution to the target resolution.The ratios are 6 for ASTER in the Zhangye oasis and 3.33 for TIRS in Beijing city; thus, the scale effect is more significant in the first study area.Similarly, Ghosh and Joshi (2014) reported that a ratio that exceeds 4 could cause a significant scale effect [30].The scale effect also depends on the phenology based on our examination in the complete crop growing season in the Zhangye oasis.

The Model for Quantifying the Scale Effect
In addition to the high target resolution, another evident difference between this study and the other studies is that the scale effect is interpreted mathematically and theoretically.Based on the linear DisTrad model, a generalized model for quantifying the scale effect was proposed, namely, Equation (5), and it was further simplified to Equation (9).Verification based on the simulation datasets generated from the airborne TASI LST and Chinese Tiangong-1 LST demonstrates that this model is valid.We also proposed a simple method to estimate the slopes at the target resolution through extrapolating the empirical functions between the slopes and STDs of the descriptors trained at the native and coarser resolutions.Therefore, users in actual applications can estimate the scale effect in the LST downscaling.
In contrast to the linear function in LST downscaling, the quadratic function was not considered in this study because it is sensitive to the outliers.However, the proposed model for quantifying the

Scale Effect and Its Influencing Factors
By concentrating on the downscaling LST, we examined the scale effect, which is commonly ignored based on the assumption of the scale-invariant LST-descriptors relationship.In two study areas that have typical landscapes, we quantified the scale effect in downscaling ASTER LST from 90 to 15 m and TIRS LST from 100 to 30 m.This study confirms the finding of the scale effect reported by [24] and [30] in agricultural areas in India and by [29] in grasslands and agricultural areas in China.One evident difference between our study and the other studies is that our purpose is to downscale satellite LST with a medium resolution (~100 m) to high resolutions, while the target resolutions of the other studies are much coarser.Thus, it is reasonable to infer that the scale effect is an extensively existing phenomenon in LST downscaling and the assumption of the scale-invariant LST-descriptors relationship is conditional.
From the good correlation between the slopes of the descriptors in the linear regression for downscaling LST and the STDs of the descriptors, it is evident that the scale effect is mainly caused by the varying probability distribution of the LST and its descriptors at the native and target resolutions.A comparison between the experiments in the Zhangye oasis and Beijing city demonstrates that the scale effect depends on the ratio of the native resolution to the target resolution.The ratios are 6 for ASTER in the Zhangye oasis and 3.33 for TIRS in Beijing city; thus, the scale effect is more significant in the first study area.Similarly, Ghosh and Joshi (2014) reported that a ratio that exceeds 4 could cause a significant scale effect [30].The scale effect also depends on the phenology based on our examination in the complete crop growing season in the Zhangye oasis.

The Model for Quantifying the Scale Effect
In addition to the high target resolution, another evident difference between this study and the other studies is that the scale effect is interpreted mathematically and theoretically.Based on the linear DisTrad model, a generalized model for quantifying the scale effect was proposed, namely, Equation (5), and it was further simplified to Equation (9).Verification based on the simulation datasets generated from the airborne TASI LST and Chinese Tiangong-1 LST demonstrates that this model is valid.We also proposed a simple method to estimate the slopes at the target resolution through extrapolating the empirical functions between the slopes and STDs of the descriptors trained at the native and coarser resolutions.Therefore, users in actual applications can estimate the scale effect in the LST downscaling.
In contrast to the linear function in LST downscaling, the quadratic function was not considered in this study because it is sensitive to the outliers.However, the proposed model for quantifying the scale effect can be adapted to the quadratic function after simplifying it to a linear function through treating the quadratic term as a new 'descriptor'.The proposed model is limited when it is adapted to a downscaling method of the artificial intelligence, e.g., Artificial Neural Network (ANN) [22] and Support Vector Machine (SVM) [30].This issue is an area of on-going research.

Scale Effect and Accuracy of the Downscaled LST
Although the scale effect could be significant in some cases, removing it would not necessarily improve the accuracy of the downscaled LST.From the experiments in the two study areas, removing the scale effect slightly improves the accuracy of the downscaled LST, when the downscaled LST is compared against the remotely sensed LSTs.When being validated against the in situ LSTs, the improvement in the accuracy is found to be diverse on different dates.Removing the scale effect yields higher accuracy on 2 out of 10 dates: one in the early stage of the crop growth and one in the middle stage.
By comparing the accuracies of the LST before and after downscaling, it is understandable that the accuracy after removing the scale effect is closely related to the error of the original LST and the downscaling process.The error of the original LST is subject to sources such as the parameterization of atmospheric influences and the estimation of land surface emissivity.This error is usually lower over relatively homogeneous surfaces (approximately 0.5 to 2.0 K), while it is considerably higher over heterogeneous surfaces [2], which is confirmed by the validation based on the in situ LSTs in the Zhangye oasis.The error of the downscaling process usually ranges from 0.5 K to 3.0 K, depending on the selected descriptors and regression tool as well as the areas where downscaling LST is performed [13].Additionally, the co-registration among the coarse resolution LSTs, the fine resolution descriptors, and the fine resolution LSTs used for evaluation also cause uncertainties.Therefore, if the error contributed from the scale effect and the sum of the other errors have the same sign, then removing the scale effect could improve the accuracy of the downscaled LST; otherwise, removing the scale effect could increase the error of the downscaled LST.

Significance of this Study
Direct LST mapping at a high resolution (e.g., <50 m) is desirable in many applications, such as in monitoring the sub-field surface ET and managing the water resources [10,35] and surface UHI estimation [11].To the best of our knowledge, such studies mainly depend on the airborne thermal remote sensing data [19].LST downscaling provides an effective way to obtain LST with a high resolution; however, most of the current related studies concentrate on downscaling LST from low resolutions (~1000 m) to medium resolutions (~100 m).This study reports downscaling LST from a medium resolution (~100 m) to high resolutions (15 m for ASTER and 30 m for TIRS) in a typical agricultural area and a suburban area.This study can provide a new understanding of downscaling the LST.Furthermore, the scale effect model reported in this study is expected to have good generalization to different native/target resolutions because it is developed theoretically.In addition to LST, this model can be extended to the downscaling of other parameters, such as the precipitation and NDVI [22,56].

Conclusions
An in-depth investigation into the basic assumption of the scale-invariant LST-descriptors relationship can facilitate the development of LST downscaling methods.Although there have been a few studies on this topic, it is still unclear, especially with regard to downscaling LSTs to high resolutions.By selecting an agricultural oasis (the Zhangye oasis) and a suburban area (in Beijing city) as the study areas, we present a new interpretation of this problem by employing four statistical downscaling models.
The airborne TASI LST with a 3-m resolution in the Zhangye oasis and the Chinese Tiangong-1 space station LST with a 10-m resolution in Beijing city were used to generate a series of simulation datasets.Then, the scale effect in downscaling the LST from a medium resolution (i.e., 90 m for ASTER in the Zhangye oasis and 100 m for TIRS in Beijing city) to a high resolution (i.e., 15 m for ASTER and 30 m for TIRS) was examined.We found that the scale-invariance assumption is conditional: no scale effect exists only when the values of the descriptors are equal to the mean values of the selected pixels in training the LST-descriptors function at the native resolution.Despite the different magnitudes, the scale effect exists in both of these two study areas.In general, the global regression model (GRM) and piecewise regression model (PRM) suffer weaker scale effects than the two local regression models (LRM-MCD and LRM-LPR).
A simple analytical model was developed to quantify the scale effect mathematically and theoretically.The validity of this model was verified by the simulation datasets and was further tested based on the actual ASTER and TIRS data in the two study areas.The results demonstrate that the scale effect is intrinsically caused by the varying probability distribution of the LST and its descriptors at the native and target resolutions.It depends on the values of the descriptors, the phenology, and the ratio of the native resolution to the target resolution.In both of these two study areas, removing the scale effect can only slightly improve the accuracy of the downscaled LST, depending on the errors involved in the original LST and the downscaling process.Nevertheless, this study facilitates the understanding of the scale effect in the LST downscaling and also provides implications for downscaling the LST from a low resolution to a medium resolution and other surface parameters.

Figure1.
Figure1.Locations and maps of the two study areas.(a) Locations of the study areas in China; (b) False color composite of the ASTER image of the Zhangye oasis acquired on 10 July 2012 (R/G/B = 3N/2/1); (c) Locations of ground sites; (d) False color composites of the OLI image of a suburban area of Beijing city acquired on 31 July 2013 (R/G/B = 5/4/3); (e) Tiangong-1 TIR image covering the same area as (d) acquired on 30 July 2013.

Figure 1 .
Figure 1.Locations and maps of the two study areas.(a) Locations of the study areas in China; (b) False color composite of the ASTER image of the Zhangye oasis acquired on 10 July 2012 (R/G/B = 3N/2/1); (c) Locations of ground sites; (d) False color composites of the OLI image of a suburban area of Beijing city acquired on 31 July 2013 (R/G/B = 5/4/3); (e) Tiangong-1 TIR image covering the same area as (d) acquired on 30 July 2013.

Figure 2 .
Figure 2. Flowchart of this study.Figure 2. Flowchart of this study.

Figure 2 .
Figure 2. Flowchart of this study.Figure 2. Flowchart of this study.

4 . 1 .
In the Zhangye Oasis 4.1.1.Scale Effect Quantified with TASI Data With the TASI simulated LSTs and ASTER derived parameters, the linear regression functions were trained at different resolutions.The R 2 values of the function of LST-NDVI are shown in Figure 3a.R 2 increases from 0.545 at 15 m to 0.824 at 90 m and remains stable at approximately 0.86-0.89after 150 m.All of the regressions are significant at the 0.001 level.Higher regression accuracy is obtained at resolutions that are coarser than 15 m due to having fewer pixel outliers.In contrast, the function of LST-FVC yields lower regression accuracy: R 2 increases from 0.530 at 15 m to 0.803 at 90 m and is ~0.84 after 150 m, which demonstrates that FVC is less capable of explaining the spatial variation of LST than NDVI in this study area on July 10.Selecting the surface albedo as the second descriptor contributes only a slight increment to the regression accuracy.

Figure 3 .
Figure 3. (a) R 2 in training the regression function of LST-NDVI and (b) the quantified scale effect at different NDVI values based on the TASI simulated LSTs.

Figure 3 .
Figure 3. (a) R 2 in training the regression function of LST-NDVI and (b) the quantified scale effect at different NDVI values based on the TASI simulated LSTs.
3.26 K to 3.33 K.The large RMSD are caused by the co-registration error between the ASTER derived descriptors and the TASI LST.At the 90-m resolution, the downscaled LSTs that have been upscaled back to 90 m also yield no systematic difference, and the RMSD values are approximately 1.0 K.The Q values are approximately 0.96, which demonstrates that good image qualities have been obtained.With regard to the scale effect being removed, the downscaled LSTs from GRM and LRM

Figure 5 .
Figure 5. (a,b) The relationship between the slopes of the LST-descriptors functions and the STDs of the descriptors in the Zhangye oasis.The regressions are significant at the 0.001 level.The LSTs were simulated from the TASI LST on 10 July 2012.

Figure 6 .
Figure 6.(a,b) The estimated scale effect in downscaling the TASI simulated LST with GRM.

Figure 5 .
Figure 5. (a,b) The relationship between the slopes of the LST-descriptors functions and the STDs of the descriptors in the Zhangye oasis.The regressions are significant at the 0.001 level.The LSTs were simulated from the TASI LST on 10 July 2012.

Figure 5 .
Figure 5. (a,b) The relationship between the slopes of the LST-descriptors functions and the STDs of the descriptors in the Zhangye oasis.The regressions are significant at the 0.001 level.The LSTs were simulated from the TASI LST on 10 July 2012.

Figure 6 .
Figure 6.(a,b) The estimated scale effect in downscaling the TASI simulated LST with GRM.Figure 6. (a,b) The estimated scale effect in downscaling the TASI simulated LST with GRM.

Figure 6 .
Figure 6.(a,b) The estimated scale effect in downscaling the TASI simulated LST with GRM.Figure 6. (a,b) The estimated scale effect in downscaling the TASI simulated LST with GRM.

Figure 7 .
Figure 7. (a-l) The ASTER LSTs downscaled from 90 to 15 m with GRM in Case B during the complete crop growing season in the Zhangye oasis and its surrounding areas.LSTs of the areas outside the oasis were downscaled by uniform disaggregation.

Figure 7 .
Figure 7. (a-l) The ASTER LSTs downscaled from 90 to 15 m with GRM in Case B during the complete crop growing season in the Zhangye oasis and its surrounding areas.LSTs of the areas outside the oasis were downscaled by uniform disaggregation.

Figure 8 .
Figure 8. (a-l) The estimated scale effects in downscaling ASTER LST with GRM in Case A during the complete crop growing season in the Zhangye oasis.

Figure 8 .
Figure 8. (a-l) The estimated scale effects in downscaling ASTER LST with GRM in Case A during the complete crop growing season in the Zhangye oasis.

Figure 9 .
Figure 9. (a) R 2 in training the regression function of LST-NDVI and (b) the quantified scale effect at different NDVI values based on the Tiangong-1 simulated LSTs.

Figure 9 .
Figure 9. (a) R 2 in training the regression function of LST-NDVI and (b) the quantified scale effect at different NDVI values based on the Tiangong-1 simulated LSTs.

Figure 11 .
Figure 11.(a,b) The relationship between the slopes of the LST-descriptors functions and the STDs of the descriptors in Beijing city.The regressions are significant at the 0.001 level.The LSTs were simulated from the Tiangong-1 LST on 30 July 2013.

Figure 11 .
Figure 11.(a,b) The relationship between the slopes of the LST-descriptors functions and the STDs of the descriptors in Beijing city.The regressions are significant at the 0.001 level.The LSTs were simulated from the Tiangong-1 LST on 30 July 2013.

Figure 11 .
Figure 11.(a,b) The relationship between the slopes of the LST-descriptors functions and the STDs of the descriptors in Beijing city.The regressions are significant at the 0.001 level.The LSTs were simulated from the Tiangong-1 LST on 30 July 2013.

Figure 12 .
Figure 12. (a-d) The estimated scale effects in downscaling the Tiangong-1simulated LST and TIRS LST with GRM in Case A and Case B.

4. 2 . 2 .
Scale Effect Quantified with TIRS DataBased on the Landsat-8 TIRS LST and the corresponding descriptors, we find that there is a significant linear correlation between the slopes of the descriptors and their STDs in the resolution range of 100-510 m.The regression functions are shown in Table5.These regression functions are used to predict the slopes of the descriptors at the 30-m resolution.The subsequently estimated scale effects in Case A and Case B are shown in Figure12c,d.They have very similar spatial patterns to the quantified scale effects in downscaling the Tiangong-1 simulated LST at the 100-m resolution (Figure10a,b).In general, built-up surfaces possess a positive scale effect, while vegetated surfaces possess a negative scale effect.

Figure 12 .
Figure 12. (a-d) The estimated scale effects in downscaling the Tiangong-1simulated LST and TIRS LST with GRM in Case A and Case B.

Figure 13 .
Figure 13.(a-c) TIRS LST at 100 m and LSTs downscaled from 100 m to 30 m with GRM in Case A in Beijing city on 31 July 2013.Surface temperatures of the lake were downscaled by uniform disaggregation.

Figure 13 .
Figure 13.(a) TIRS LST at 100 m and (b,c) LSTs downscaled from 100 m to 30 m with GRM in Case A in Beijing city on 31 July 2013.Surface temperatures of the lake were downscaled by uniform disaggregation.

Table 1 .
The functions for predicting the slopes of descriptors in the regression functions for downscaling the ASTER LSTs with GRM on the 12 dates.
V and σ A are the STDs of NDVI and albedo; 1 the regression is not significant at the 0.05 level.

Table 2 .
MBE and RMSE values of the LSTs downscaled with GRM validated based on in situ LSTs at all sites except EC04.

Table 2 .
MBE and RMSE values of the LSTs downscaled with GRM validated based on in situ LSTs at all sites except EC04.

Table 3 .
Errors of the LSTs downscaled with GRM validated based on in situ LSTs at EC04.

Table 3 .
Errors of the LSTs downscaled with GRM validated based on in situ LSTs at EC04.

Table 4 .
Q calculated with the original ASTER LSTs and the downscaled LSTs that have been upscaled back to 90 m.

Table 5 .
The functions for predicting the slopes of descriptors in the regression functions for downscaling Landsat-8 TIRS LST with GRM on 31 July 2013.σV and σB are the STDs of NDVI and NDBI; the regressions are significant at the 0.01 level; the sample size in regression is 15.

Table 6 .
MBD, RMSD, and Q values in evaluating the downscaled TIRS LST based on the original TIRS LST at the 100 m resolution.

Table 5 .
The functions for predicting the slopes of descriptors in the regression functions for downscaling Landsat-8 TIRS LST with GRM on 31 July 2013.
Notes: σ V and σ B are the STDs of NDVI and NDBI; the regressions are significant at the 0.01 level; the sample size in regression is 15.

Table 6 .
MBD, RMSD, and Q values in evaluating the downscaled TIRS LST based on the original TIRS LST at the 100 m resolution.