Reconstruction of the Daily MODIS Land Surface Temperature Product Using the Two-Step Improved Similar Pixels Method

The MODIS land surface temperature (LST) product is one of the most widely used data sources to study the climate and energy-water cycle at a global scale. However, the large number of invalid values caused by cloud cover limits the wide application of the MODIS LST. In this study, a two-step improved similar pixels (TISP) method was proposed for cloudy sky LST reconstruction. The TISP method was validated using a temperature-based method over various land cover types. The ground measurements were collected at fifteen stations from 2013 to 2018 during the Heihe Watershed Allied Telemetry Experimental Research (HiWATER) field campaign in China. The estimated theoretical clear-sky temperature indicates that clouds cool the land surface during the daytime and warm it at nighttime. For bare land, the surface temperature shows a clear seasonal trend and very similar across stations, with a cooling amplitude of 4.14 K in the daytime and a warming amplitude of 3.99 K at nighttime, as a yearly average. The validation result showed that the reconstructed LST is highly consistent with in situ measurements and comparable with MODIS LST validation accuracy, with a mean bias of 0.15 K at night (−0.43 K in the day), mean RMSE of 2.91 K at night (4.41 K in the day), and mean R2 of 0.93 at night (0.90 in the day). The developed method maximizes the potential of obtaining quality MODIS LST retrievals, ancillary data, and in situ observations, and the results show high accuracy for most land cover types.


Introduction
Land surface temperature (LST), controlled by land-atmosphere interactions and energy fluxes, is a key parameter in the earth's climate [1] and energy-water cycle at regional and global scales [2][3][4]. LST drives and controls many biophysical processes between the hydrosphere, atmosphere, and biosphere [5,6], playing an important role in climate research and global change, such as monitoring glaciers, vegetation assessments, and terrestrial carbon flux research [7,8].
In situ derived LST data, which are assessed from various sources, such as meteorological stations and Argo floats, have the advantage of high reliability and a longer time series [9]. However, in situ data measurements are often sparsely or/and irregularly distributed, especially in polar regions and mountainous areas [10,11].
Satellite LST data, which can be easily acquired with global coverage, provides the only opportunity of observing LSTs with sufficient spatial resolution and adequate time series [3]. Over the past decade, satellite LST data have been widely used for glacial-hydroclimatological studies [11,12], permafrost mapping [13,14], environmental assessment [15], Remote Sens. 2021, 13, 1671 2 of 27 and climate change [16] in areas that lack ground-based LST data, such as the Himalayan region, the Tibetan Plateau, the Arctic, and Antarctica. The retrieval of LSTs from satellites is based on a variety of mature algorithms that have been proposed for thermal infrared (TIR) channels since the 1970s [17]. Numerous satellite LST products have been developed from a variety of sensors [18,19]. The most widely used LST products from Moderate Resolution Imaging Spectroradiometer (MODIS) on Terra and Aqua satellites retrieved by the generalized split-window algorithm [20] are available between February 2000 and July 2002, respectively. The MODIS LST products have the advantage of optimal spatial resolution (nominally 1 km), and high temporal resolution (four global coverages a day, at 10:30, 13:30, 22:30, 01:30 nominal time). The data has been validated by in situ LST measurement for various land cover types. The accuracy of MODIS LST products was better than 1 K at most stations [4]. However, cloud cover can cause frequent missing data in satellite LST data. Research confirmed that more than 60% of the Earth's surface is covered by clouds on any given day on average [21], seriously restricting its applications to clear-sky conditions [22]. Data gaps in satellite LST products impact understanding the magnitude of its trend [23] and the absolute LST value compared with in situ data [24]. The large differences in LST between clear-sky and cloudy-sky conditions limit its applications for climate change research [24,25].
To date, numerous methods have been developed for filling in LST gaps under cloudysky conditions to obtain a spatially and temporally continuous satellite LST product. Geostatistical interpolation is the most common spatially-based LST reconstruction method (e.g., the Kriging method and many improved Kriging methods) [10]. Methods based on spatial correlations of cloudy-sky pixels with neighboring clear-sky pixels can only be applied to images with few cloud-covered pixels. The second type of cloudy-sky LST reconstruction method considers correlations in time [26]. Based on the trend and seasonality of LST time series, the harmonic analysis of time series [27] and the Savitzky-Golay filter method [28,29] have been used to build a temporal model by fitting available LST time series data at each point and then apply it to reconstruct cloudy-sky pixels. However, temporal approaches work best on seasonal LST changes and smooth diurnal LST variations [30]. Thus, temporal approaches are not suitable for reconstructing the daily MODIS LST product, especially in areas with active land cover changes. Because LST is impacted by environmental factors such as vegetation and elevation, gap-filling reconstructs the cloudy pixel LST by building an empirical LST model with related environmental datasets such as normalized difference vegetation index (NDVI) and digital elevation model (DEM) [26]. The reconstructed LST is impacted by the uncertainty of these auxiliary datasets and the accuracy of the empirical LST models. These methods of deriving the theoretical clear-sky LST are subject to system bias errors.
Many methods have been developed to derive actual cloudy-sky LST. The surface energy balance-based (EB) method was proposed to recover cloudy-sky LST based on its spatially and temporally neighboring valid pixels [31,32]. This method is based on the assumption that a cloudy pixel differs in temperature from its neighboring clear-sky pixels mainly because of their different surface incident radiation and redistributions. Therefore, cloudy-sky LST can be obtained based on its spatially and temporally neighboring clear-sky LST and a temperature correction term related to the surface incident radiation, atmospheric temperature, and wind velocity, which are usually not available at a global scale. The first operational all-weather LST product for MSG/SEVIRI derived at the Satellite Application Facility on Land Surface Analysis (LSA-SAF) through the EB method utilized the all-weather downward surface shortwave and longwave radiation flux products [33,34]. However, the LSA-SAF product doesn't cover the Asian region and needs to be validated for additional land cover types. Many cloudy-sky LST methods use passive microwave temperature brightness (TB) data [35] because they can penetrate clouds and are strongly correlated with infrared data [36]. However, due to accuracy issues and different spatial resolution, the microwave LST cannot directly fill invalid MODIS LST, especially in complex land surface environments. Recently, data-fusion approaches Remote Sens. 2021, 13, 1671 3 of 27 of TIR LST with land data assimilation system, numerical weather model outputs, or in situ observations have been developed for the reconstruction of cloudy-sky LST [37][38][39]. However further validation in cold and arid regions is needed.
The most promising LST reconstruction algorithm combines more than one of the above methods, such as space-based methods that include auxiliary information [40,41] or spatiotemporal information [30]. Yu et al. assumed that the most similar pixels (SP), whose near-surface thermal environmental factors (composed of topographic factors, vegetation factor, and solar radiation) with high similarity, may follow the same change trend over time [42]. With this approach, cloudy-sky LST is recovered by constructing a transfer function of the most similar pixels at two time periods. In recent research, the SP method is an easy-to-use and effective interpolation method [43]. However, this method relies on manually selecting the reference images from warm and cold days, which may impact the stability and accuracy of the entire algorithm.
This study intends to develop a two-step framework for the reconstruction of cloudysky LST by combining the advantages of the SP method and abundant in situ observation. In situ LST observations from ten automatic weather stations (AWS) during 2013-2018 were used to calculate the cloudiness-induced temperature bias for different land cover types and then to correct the theoretical clear-sky LST. In addition, LST observations performed by all fifteen AWS (including ten AWS for bias correction) during 2013-2018 were used to validate the accuracy of the reconstructed cloudy-sky LST. The new method resolves the instability of the SP method in the reference image selection and maximizes the potential of in situ observations by combining the SP method and a statistical bias correction approach between in situ observations and theoretical clear-sky temperatures.

Study Area
The study area is the Heihe River Basin (HRB) of arid Northwest China, which covers an area of 143,200 km 2 ( Figure 1). In this region, there is a well-known basin-scale integrated observatory network [44]. In the mountainous upstream, there are typical cold region landscapes including glaciers, frozen ground, alpine meadows, and forests [45]; the annual rainfall is 500-800 mm. The midstream area is dominated by an artificial oasis-riparian zone-desert compound landscape; the annual rainfall is 100-250 mm. The downstream area is dominated by desert areas dispersed with little natural oases; the annual rainfall is less than 50 mm [44,46].
The HRB has a marked altitudinal gradient from south to north, ranging from 2640 m to 5000 m upstream, from 1000 m to 3000 m midstream, and from 800 m to 1700 m downstream [47]. Thus, temperature shows zonal characteristics as the altitude changes from the upstream to the downstream and complex terrain. Therefore, the accuracy of the newly developed framework can be thoroughly assessed. Table 1 lists the datasets that were used in this research, including MODIS datasets, topographic data, and ground station radiation measurements by the net radiometer. The variables include LST, emissivity, NDVI, DEM, theoretical radiation, upwelling and downwelling longwave radiation.   Table  2.   Table 2.

Remote Sensing and Auxiliary Data
The Terra satellite was launched on 18 December 1999, and has 10:30 a.m./p.m. equatorial overpass times. It carries the MODIS sensor, which provides high temporal frequency, spatially detailed, and accurate earth observations. In this paper, the Terra MODIS LST Daily L3 Global 1 km SIN Grid product MOD11A1 (10.5067/MODIS/MOD11A1.006) was utilized and daytime and nighttime LST from 2013-2018 were chosen to implement the reconstruction process. The MOD11A1 product is retrieved at the 1 km resolution by the refined generalized split-window algorithm [4]. The quality of raw LST data is evaluated according to quality assurance (QA) metadata. The LST with QA flags of cloud, mean emissivity error >0.04, and LST error larger than 3 K are marked as invalid pixels to be reconstructed. Figure 2 shows the spatial distribution of daytime and nighttime MODIS LST data gaps that need to be recovered during 2013-2018. The missing MODIS LST data in the daytime are substantially more than those during the nighttime; the highest data gaps appear at daytime in winter and at nighttime in summer. Spatially, a larger number of missing data appears upstream at daytime and midstream at nighttime. are marked as invalid pixels to be reconstructed. Figure 2 shows the spatial distribution of daytime and nighttime MODIS LST data gaps that need to be recovered during 2013-2018. The missing MODIS LST data in the daytime are substantially more than those during the nighttime; the highest data gaps appear at daytime in winter and at nighttime in summer. Spatially, a larger number of missing data appears upstream at daytime and midstream at nighttime. The Terra MODIS emissivity Daily L3 Global 6 km SIN Grid product MOD11B1 (10.5067/MODIS/MOD11B1.006) provides the narrowband emissivity, combined with the radiation data observed by ground-based meteorological towers to estimate the in situ land surface temperature.
In the Two-step Improved Similar Pixels Method (TISP) method proposed in this study, there are five environmental factors used to identify similar pixels, including NDVI, theoretical clear-sky radiation, elevation, slope, and aspect. NDVI data collected from the MODIS product MOD13A2 (10.5067/MODIS/MOD13A2.006) aggregated for 16 days were used to indicate the vegetation conditions. The theoretical clear-day solar The Terra MODIS emissivity Daily L3 Global 6 km SIN Grid product MOD11B1 (10.5067/MODIS/MOD11B1.006) provides the narrowband emissivity, combined with the radiation data observed by ground-based meteorological towers to estimate the in situ land surface temperature.
In the Two-step Improved Similar Pixels Method (TISP) method proposed in this study, there are five environmental factors used to identify similar pixels, including NDVI, theoretical clear-sky radiation, elevation, slope, and aspect. NDVI data collected from the MODIS product MOD13A2 (10.5067/MODIS/MOD13A2.006) aggregated for 16 days were used to indicate the vegetation conditions. The theoretical clear-day solar radiation was simply simulated using the solar incident angle, latitude, elevation, slope, aspect, and diffuse and reflected radiation [48]. The slope and aspect, calculated using DEM, were acquired from the Consultative Group for International Agricultural Research, Consortium for Spatial Information SRTM 90 m Database site (http://srtm.CSI.cgiar.org, accessed on 5 March 2021).
The MODIS products were reprojected, mosaicked, and resampled to 1 km (927.66 m) spatial resolution using the MODIS Reprojection Tool (MRT). DEM, slope, aspect data were reprojected and resampled to the same 1 km spatial resolution using OGR/GDAL.

Ground LST Estimation
To evaluate the accuracy of the C6 MOD11A1 product and reconstructed cloudy-sky LST, this study utilized the temperature-based method (T-based) [3], which directly validates the satellite-derived LST by ground-based LST observations at the satellite overpass.
Ground-based measurements were available from fifteen AWS distributed across the mainland surfaces in HRB. The information on the fifteen stations is shown in Table 2. All the AWS, which were established during the HiWATER field campaign from 2013 to 2018 [44,49], are equipped with Kipp & Zonen (CNR1 or CNR4), Eppley (PSP & PIR), or Hukseflux (NR01) net radiometers for observing the surface upwelling and downwelling longwave radiation with 10-min observation temporal resolution. All the stations are subject to rigorous data-quality controls implemented through inter-comparison and calibration of experimental instruments, observatory maintenance, and data processing [47]. The ground-based longwave radiation measurements taken at all AWS were selected to verify the accuracy of the reconstructed LST and MODIS LST. Ten of these measurements were used to calculate the cloud effect temperature term to correct the theoretical clearsky LST.
AWS longwave radiation measurements cannot be used directly to verify LST accuracy and should be transferred into LST data. Based on thermal radiative transfer theory and the Stefan-Boltzmann law, the in situ measured LST T in−situ can be obtained using AWS longwave radiation measurements by the following equation: where F ↑ in−situ and F ↓ in−situ are the in situ observed surface upwelling and downwelling longwave radiation; ε b is the surface broadband emissivity; δ is the Stefan-Boltzmann constant (5.67 × 10 −8 Wm −2 K −4 ). Here ε b for each site was estimated by the MODIS surface narrowband emissivity product via empirical relationships [50]: where ε 29 ,ε 31 and ε 32 are MODIS bands 29, 31, and 32 narrow emissivities, respectively.

MODIS LST Accuracy
The C6 MOD11A1 1-km LST product was assessed by in situ LST measurements at fifteen HiWATER stations during 2013-2018. Several remote sensing LST products were evaluated by these HiWATER stations [51,52].
The MOD11A1 product provides the QA flags that describe quality assurance. According to the QC flags, only the LST with "good quality" (mandatory quality assurance flag = 0) were used to validate the data. Subsequently, in situ LST data were selected based on the overpass time of Terra.
The daytime and nighttime validation results are shown in Figure 3 and Table 5. The biases of MOD11A1 LST are positive for most stations at daytime, ranging from −2.91 to 4.50 K, with an average bias of 1.32 K. The result showed that MOD11A1 overestimated LST at daytime, and the maximum bias of 4.50 K appeared in HMZ. All fifteen stations have root mean square error (RMSE) higher than 2 K, from 2.78 to 5.98 K, with average RMSE of 3.96 K. 4.50 K, with an average bias of 1.32 K. The result showed that MOD11A1 overestimated LST at daytime, and the maximum bias of 4.50 K appeared in HMZ. All fifteen stations have root mean square error (RMSE) higher than 2 K, from 2.78 to 5.98 K, with average RMSE of 3.96 K.  By contrast, MOD11A1 LST biases are negative in most stations at nighttime, ranging from −2.91 to 0.09 K, with an average value of −1.58 K. All fifteen stations showed RMSE ranging from 1.14 to 3.84 K, and lower than 3 K for most stations, with average RMSE of 2.91 K. The validation results revealed that the MOD11A1 LST accuracy is better at nighttime than daytime. There are few outliers in some stations, especially DSL, JYL, and ARC, due to undetected cloud contamination [53]. In this study, the LST of the undetected cloud pixels is not removed because those data have QA = 0. The spatial representativeness of HiWATER stations LST measurement was assessed [54]. For some stations, the validation accuracy varied widely between daytime and nighttime, especially for HMZ, YKZ, HHL, JYL, and HYZ. Because these are nonrepresentative stations, they were more heterogeneous [54].
As shown in Figure 4, the relationship between MODIS LST and in situ LST varies with altitude and season. At daytime, the RMSE between MODIS LST and in situ LST has a non-linear correlation with the altitude of the station ( Figure 4). The RMSE is larger for stations located at relatively lower and higher altitudes (Figure 4a). At nighttime, the RMSE is non-linearly correlated with altitude only in the summer and it is larger at relatively lower and higher altitudes.
to undetected cloud contamination [53]. In this study, the LST of the undetected cloud pixels is not removed because those data have QA=0. The spatial representativeness of HiWATER stations LST measurement was assessed [54]. For some stations, the validation accuracy varied widely between daytime and nighttime, especially for HMZ, YKZ, HHL, JYL, and HYZ. Because these are nonrepresentative stations, they were more heterogeneous [54].
As shown in Figure 4, the relationship between MODIS LST and in situ LST varies with altitude and season. At daytime, the RMSE between MODIS LST and in situ LST has a non-linear correlation with the altitude of the station ( Figure 4). The RMSE is larger for stations located at relatively lower and higher altitudes (Figure 4a). At nighttime, the RMSE is non-linearly correlated with altitude only in the summer and it is larger at relatively lower and higher altitudes.
In general, the RMSE is lower for nighttime than daytime LST (about 1.5 K) and the RMSE is higher in spring (March to May) and summer (June to August) in comparison to the winter months for most stations, which is consistent with a previous study [11]. The maximum RMSE at daytime for the high-altitude stations DSL and JYL occurs in December due to undetected clouds in MODIS LST (Figure 3a).
Since HMZ is a desert with very low vegetation cover (multiyear maximum NDVI 0.054), the RMSE of HMZ is the smallest of all stations and less than 1.5 K in most months during nighttime.

MODIS LST Reconstruction Method
In this study, a Two-step Improved Similar Pixels Method (TISP) framework ( Figure  5) was proposed for generating all-sky MODIS LST based on the idea that cloudy LST In general, the RMSE is lower for nighttime than daytime LST (about 1.5 K) and the RMSE is higher in spring (March to May) and summer (June to August) in comparison to the winter months for most stations, which is consistent with a previous study [11]. The maximum RMSE at daytime for the high-altitude stations DSL and JYL occurs in December due to undetected clouds in MODIS LST (Figure 3a).
Since HMZ is a desert with very low vegetation cover (multiyear maximum NDVI 0.054), the RMSE of HMZ is the smallest of all stations and less than 1.5 K in most months during nighttime.

MODIS LST Reconstruction Method
In this study, a Two-step Improved Similar Pixels Method (TISP) framework ( Figure 5) was proposed for generating all-sky MODIS LST based on the idea that cloudy LST could be derived by the theoretical clear-sky LST and the cloud effect temperature term. First, the theoretical clear-sky LST was reconstructed by the improved SP method [42] in Step I. Second, the systematic bias of the theoretical clear-sky LST estimates was corrected by the could be derived by the theoretical clear-sky LST and the cloud effect temperature term. First, the theoretical clear-sky LST was reconstructed by the improved SP method [42] in Step I. Second, the systematic bias of the theoretical clear-sky LST estimates was corrected by the cloud effect on temperatures of different landcover types using statistical relations between the in situ observations and theoretical clear-sky LST. Step I: selection of similar pixels (green pixels) and estimate Tck for pixel, i; Step II: Obtaining the cloud effect temperature term (ΔTs), and bias correction of Tck to obtain the actual cloudy-sky LST (Tcd).

Reconstruction of Theoretical Clear-Sky LST
The complete theoretical clear-sky LST generation process is shown in Figure 5 Step I, and the detailed steps are as follows: (1) Marking invalid pixels. The quality of the original LST data was evaluated based on quality assurance (QA) metadata. LST with QA flags of cloud, mean emissivity error >0.04, and LST error larger than 3 K were marked as invalid pixels and need to be reconstructed.
(2) Generating reference images: In the original SP method, two reference images were manually selected to represent the cold and warm seasons. Since cloud morphology Step I: selection of similar pixels (green pixels) and estimate T ck for pixel, i; Step II: Obtaining the cloud effect temperature term (∆T s ), and bias correction of T ck to obtain the actual cloudy-sky LST (T cd ).

Reconstruction of Theoretical Clear-Sky LST
The complete theoretical clear-sky LST generation process is shown in Figure 5 Step I, and the detailed steps are as follows: (1) Marking invalid pixels. The quality of the original LST data was evaluated based on quality assurance (QA) metadata. LST with QA flags of cloud, mean emissivity error >0.04, and LST error larger than 3 K were marked as invalid pixels and need to be reconstructed.
(2) Generating reference images: In the original SP method, two reference images were manually selected to represent the cold and warm seasons. Since cloud morphology and the land surface can change over a long period of time, this manual selection of the reference image may introduce large errors. In this study, the reference image selection of the TISP method was improved. The image set was generated by calculating the moving average of five consecutive MOD11A1 LST images, where the average image with a data gap rate of less than 5% was selected as an element of the reference image set and interpolated with the conventional Kriging algorithm. For each image to be reconstructed, the temporally closest reference image was automatically selected.
(3) Locating similar pixels and determining the transfer function.
(a) Selecting impact factors. In Step I ( Figure 5), the basic step of the whole TISP method process was to select pixels similar to those that need to be reconstructed, which is based on the synergistic influence of cloudy-sky pixels and similar pixels having comparable near-surface thermal states as well as geographical position, topography, and land cover. The general factors of local topography include altitude, slope, aspect, and hill shade. The effect of altitude can be expressed in terms of lapse rate (decrease in temperature with altitude) [55]. In the mean free atmosphere, the temperature lapse rate is in the range of 0.60-0.70 K/100 m [56]. For LST, the lapse rate varies by region, e.g., in the Himalayan region lapse rate is in the range 0.60-0.74 K/100 m [57], while in Mt. Taibai it is only 0.34-0.50 K/100 m [58]. Slope and aspect affect LST through solar radiation and heat load [59]. Due to the effect of illumination, LST differs significantly between the South and North aspects [60]. LST is negatively correlated with slope [61] and the shading of a hill also impacts LST [62,63]. Only altitude, slope, and aspect were used to characterize the topography in this paper. Land cover can also affect LST by changing the surface thermal properties. Here, NDVI represents land cover because NDVI and LST are well correlated [64][65][66]. Therefore NDVI, DEM, slope, aspect, radiation were selected as the impact factors in this study.
(b) Normalizing impact factors. Since impact factors have different data ranges and units, the data were normalized using the following function: where k i is the impact factor value corresponding to each pixel i; S i k is the normalized impact factor k i , k max and k min are the maximum and minimum values of the impact factor in the whole image or study area, respectively. They are the spatial statistical values in the study area.
(c) Selecting similar pixels by discriminant function. The most similar pixels set D of a cloudy-sky pixel are identified by the similar threshold L using these normalized impact factors for the reconstructed and corresponding reference image: where S is the factors set that affect the LST of cloudy pixel i, S n is the nth impact factor; S I is all attributes of cloudy pixel i and valid pixels in the reconstructed image; S J is all the attributes of valid pixels j in the same region of the reference image; and d is the similarity function, which measures the similarity of the other valid pixels to the cloudy pixel i. In order to a simple implementation, d takes the form of Euclidean distance. Then, the similar pixels in set D are the desired pixels with a distance less than a given threshold L (in this study, L = 0.3). To obtain robust statistics of similar pixels in two images, it is essential to filter MODIS product outliers caused by undetected clouds [53]. In this study, the "3σ-Hampel identifier" method was used to remove a relatively small number of outliers [67].
(d) Linear regression using the transfer function.
Supposing that the pixels in set D in the reconstructed time t rec and reference image time t re f have a similar change trend, a transfer function (f ) was regressed by the similar pixels D pairs at t rec and t re f . Then, the theoretical clear-sky LST (T i ck ) of the reconstructed pixel at time, t rec , in the reconstructed image T t rec can be calculated by the f relying on the corresponding LST data (T i t re f ) in the reference image, and the function f can be the simple linear function: where a is the linear regression slope and b is the linear regression intercept.

Bias Correction of Theoretical Clear-Sky LST
In this step, we assume that the actual cloudy-sky LST T cd has the two parts, including the theoretical clear-sky LST (T ck ) and the cloud affect temperature term (∆T s ): T ck can be calculated in Step I (Figure 4). We assume that the cloud-affected temperature part (∆T s ) is related to time, NDVI, and geography. It is obtained through the statistic of T ck and in situ LST, according to the following bias correction steps: (1) All the pixels in T ck and sites are classified into four types based on the yearly max (2) Monthly ∆T s is approximately equal to the mean bias error of each site type LST observation with T ck monthly at day and night, respectively: where ∆T t,m s is the cloud effect temperature at land cover type t in month m (generally, clouds decreased LST during the day and increased LST at night), j is the landcover type of each site, k is the number of the site's land cover type t, i is the cloudy pixel during month m, n is the number of T ck during each month m (n ≤ 31), T i,j ck is the theoretical LST during day i and site j with landcover type m. T i,j in−situ is the in situ LST at day i and site j.
(3) The cloudy-sky LST before variance scaling T t,m cd−un with land cover type, t, and month, m, can be calculated with the following bias correction equation: where T t,m ck is the theoretical clear-sky LST with type t and month m. (4) Variance scaling In order to minimize the systematic biases among reconstructed cloudy temperature and MODIS LST, the variance scaling method [37,68] was utilized: The final actual cloudy-sky LST T t,m cd with land cover type t and month m can be calculated by Equation (11), the standard deviation (σ) of T t,m cd−un is scaled according to the proportion of MODIS LST monthly σ(T t,m modis ) and σ(T t,m cd−un ) at month m and then the variance-corrected LST data were shifted back using the mean value (µ) of T t,m cd−un .

Evaluation Metrics
The metrics for performance evaluation of the algorithm included mean bias error (MBE), root mean square error (RMSE), and coefficient of determination (R 2 ). The evaluation metrics are defined as follows: where n is the in situ observation number; T i rec and T i in−situ are the ith reconstructed and in situ measured LST; T rec and T in−situ are average values of the reconstructed and in situ observed LST, respectively.

Results and Discussion
The estimated theoretical clear-sky LST indicates that clouds cool the land surface during the daytime and warms it at nighttime (Terra overpass times, Figure 6). The response amplitude and trends vary by land surface types (Figure 7). The TISP method reconstructed cloudy-sky LST was validated using the in situ LST at fifteen stations. Figure 8 and Table 5 show the verified results for daytime and nighttime.
For the MBE, RMSE, and R 2 between the reconstructed cloudy-sky LST and ground measurement LST, the mean MBE of 0.15 K, mean RMSE of 2.91 K, and mean R 2 of 0.93 at nighttime are better than daytime (mean MBE = −0.43 K, mean RMSE = 4.41 K, and mean R 2 = 0.90). The accuracy is lower in the sparsely vegetated shrub and tree type in the arid region in the daytime.

Evaluation Metrics
The metrics for performance evaluation of the algorithm included mean bias error (MBE), root mean square error (RMSE), and coefficient of determination (R 2 ). The evaluation metrics are defined as follows:

Results and Discussion
The estimated theoretical clear-sky LST indicates that clouds cool the land surface during the daytime and warms it at nighttime (Terra overpass times, Figure 6). The response amplitude and trends vary by land surface types (Figure 7). The TISP method reconstructed cloudy-sky LST was validated using the in situ LST at fifteen stations. Figure  8 and Table 5 show the verified results for daytime and nighttime.     For the MBE, RMSE, and R 2 between the reconstructed cloudy-sky LST and ground measurement LST, the mean MBE of 0.15 K, mean RMSE of 2.91 K, and mean R 2 of 0.93 at

Generating the Theoretical Clear-Sky LST
The missing data covered by clouds were estimated using the TISP method in Step I ( Figure 5) to generate theoretical clear-sky LST (T ck ). T ck and the in situ LST (T in situ ) on cloudy days were compared across the ten stations ( Figure 6). Because the T in situ measurements compared with the T ck represent the cloudy-sky condition, the fitted line of the T ck data at daytime is situated above the 1:1 line (Figure 6a) owing to the cloud cooling effect by decreasing downwelling shortwave radiation. In contrast, the T ck is lower than the T in situ because of the warming effect due to the cloud increased downwelling longwave radiation at nighttime.
The accuracy evaluation statistical metrics for T ck are shown in Table 3, which are similar to those shown in Figure 6. The differences between the T ck and T in situ for cloudy skies vary with time. The comparison of T ck and T in situ on cloudy days results in mean MBE of 2.76 K, mean RMSE of 5.16 K, and mean R 2 of 0.90 at daytime; mean MBE of −4.69 K, mean RMSE of 5.67 K, and mean R 2 of 0.91 at nighttime. There is higher R 2 and MBE between the T ck and T in situ at nighttime than in the daytime. The above results show that, although it is practicable to estimate the T ck for cloudysky pixels using the TISP method, there is still a systematic bias in the T ck . Furthermore, the systematic bias between T ck and T in situ is associated with different land cover types [37], which could be a result of how clouds influence the temperature of different surfaces. Therefore, it is essential to quantitatively analyze the cloud effect on LST for different land covers located in different regions. In this research, we classified the land covers into four types according to the summer mean NDVI. According to this classification scheme, ARC   (Figure 7d).
At nighttime, most stations show a net warming effect of clouds, high warming amplitude in spring, moderate in summer, and low in winter. The temperature affected by cloud at most stations showed a monthly "W" shape trend caused by similar cloud cover (Figure 7e-h); the max warming amplitude of about 7 K appears in spring at the sparse vegetation and bare land sites (Figure 7g); ARC shows a different trend with a warming amplitude of 5-7 K, a max warming amplitude of 7 K also appears in spring and winter (Figure 7e). Figure 7c,g show the daytime and nighttime cloud effects temperature amplitude at sparse vegetation stations. They have the same seasonal trend in daytime and nighttime, but HHL has a stronger cooling effect than SDQ in the daytime. The sparse vegetation stations have a max cooling amplitude of about 5 K in summer and a warming amplitude of about 8 K in spring. Figure 7d and h show the daytime and nighttime temperature changes caused by cloud cover at bare land stations (HZZ, JCHM midstream, HMZ downstream). These stations have the same trend of cloud effect amplitude on the temperature in the daytime and nighttime. The max cooling amplitude of about 8 K appears in summer and warming amplitude of about 6 K in spring.
Most of the subfigures in Figure 7 show the same trend and similar amplitude. The yearly cloud effect temperature statistics are shown in Table 4. At bare land sites, located in the middle and downstream, the temperature change trends caused by cloud cover are seasonal and very similar between stations, with a cooling amplitude of 4.14 K in the daytime and warming of 3.99 K at nighttime. For dense vegetation, the temperature change trends have a small seasonality and are very similar across stations during the daytime. Therefore, LST can be corrected for cloud effects in each pixel by using the bias correction of stations grouped by vegetation density class.

Validation
To verify the accuracy of the new TISP framework, the corrected actual cloudy-sky LST (T cd ) was validated with the data from the fifteen stations during 2013-2018. To minimize the biases caused by time mismatch, only the in situ LST observations matching the Terra overpass times within the closest half hour were chosen.
The validation scatters density plots are shown in Figure 8a (daytime) and Figure 8b (nighttime). The plots indicate that most of the scatter points between T cd and T in situ are concentrated around the 1:1 line. Moreover, the T cd at nighttime shows greater agreement with the in situ LST measurement than at daytime. Table 5 shows the statistical metrics (MBE, RMSE, and R 2 ) for each station. The mean statistical metrics for all stations, ten bias correction stations, and five independent stations are shown in Table 6. The validation accuracy of all stations indicates that the T cd have mean MBE <0.50 K (−0.43 K at daytime, 0.15 K at nighttime), mean RMSE of 4.41 K, and R 2 of 0.90 at daytime, and have mean RMSE of 2.91 K, and R 2 of 0.93 at nighttime. The accuracy of five independent stations have mean MBE of −1.32 K, mean RMSE of 4.86 K, and R 2 of 0.88 at daytime, and have mean MBE of 0.26 K, mean RMSE of 2.79 K, and R 2 of 0.92 at nighttime, slightly lower than the result of all stations.   Figure 8a compares T cd against T in situ in the daytime. Overall, most of the T cd show an excellent agreement with the T in situ . The MBE of all stations is between ±1.5 K. For five independent stations, MBE ranged from −3.56 K to −0.08 K and R 2 ranged from 0.84 to 0.93. The accuracy of the T cd at the sparse vegetation stations (HHL, SDQ, and HYZ) are lower than others. This may be due to three reasons: (1) The MODIS LST product has poor accuracy in arid areas and few outliers with large errors due to undetected clouds. It may cause similar pixels generated by the TISP method to have large uncertainty, which directly impacts the accuracy of T ck . (2) There is uncertainty in the cloud effect temperature term estimation in these two stations. Although they belong to the sparse vegetation type (the stations have similar NDVI), the land surface of HHL and HYZ is mixed with sand, Populus euphratica, and Tamarix, thus the tree shadows increase the heterogeneity in the regional temperature [62], while SDQ is mixed with sand and Tamarix, resulting in a more homogeneous land surface than HHL. This directly impacts the accuracy of the cloud effect temperature of this land cover type (Figure 7c). (3) These three stations are very close to each other (Figure 1), resulting in a big bias from large viewing angles. Figure 8b is the validation result at nighttime. For all stations, MBE ranges from −0.37 to 0.92 K, with an average MBE of 0.15 K; RMSE ranges from 2.49 to 3.47 K, with an average RMSE of 2.91 K; R 2 ranges from 0.91 to 0.96 (Table 5), with a mean R 2 of 0.93. Figure 3 and Table 5 show the evaluation of MODIS LST for daytime and nighttime. The validation accuracy of the MODIS LST product has an average MBE of 1.32 K (−1.58 K), average RMSE of 3.96 K (2.64 K), and average R 2 of 0.93 (0.95) at daytime (nighttime).
Overall, the accuracy of the TISP method at all stations is generally comparable to the accuracy of the valid MODIS data. The corrected T cd at most stations has better agreement with the T in situ at nighttime than daytime. The result shows that the TISP method is dependable for most land cover types in HRB. The in situ LST observation at Terra overpass time in the cloudy sky varied from 250 K (240 K) to 330 K (320 K) during 2013-2018. This temperature range can represent HRB's temperature in cloudy skies. This means that the TISP method is reliable for basin-wide applications in the HRB.

An Experiment for Testing the Accuracy of the T ck
In order to verify the accuracy of the T ck , a numerical simulation experiment was designed for clear-sky MODIS LST images. For the experiment, two images of daytime and nighttime LST (data gaps <5%) from the MOD11A1 LST product were selected for day 205 of 2018. Then, the original MODIS LSTs were masked with four circular areas with diameters of 50, 100, 150, and 200 km in three experiment regions (Figure 9). The four black circles with diameters (D) of 50, 100, 150, 200 km centered on each red dot represent the simulated cloud cover area, which was then reconstructed. The three typical experiment areas are located in a relatively dense vegetation cover area upstream (upstream test), medium vegetation cover region midstream (midstream test), and low vegetation cover area downstream of HRB (downstream test). Upstream test is the most complicated site because it has relatively less vegetation heterogeneity but large elevation changes. Overall, the accuracy of the TISP method at all stations is generally comparable to the accuracy of the valid MODIS data. The corrected Tcd at most stations has better agreement with the Tin situ at nighttime than daytime. The result shows that the TISP method is dependable for most land cover types in HRB. The in situ LST observation at Terra overpass time in the cloudy sky varied from 250 K (240 K) to 330 K (320 K) during 2013-2018. This temperature range can represent HRB's temperature in cloudy skies. This means that the TISP method is reliable for basin-wide applications in the HRB.

An Experiment for Testing the Accuracy of the Tck
In order to verify the accuracy of the Tck, a numerical simulation experiment was designed for clear-sky MODIS LST images. For the experiment, two images of daytime and nighttime LST (data gaps <5%) from the MOD11A1 LST product were selected for day 205 of 2018. Then, the original MODIS LSTs were masked with four circular areas with diameters of 50, 100, 150, and 200 km in three experiment regions (Figure 9). The four black circles with diameters (D) of 50, 100, 150, 200 km centered on each red dot represent the simulated cloud cover area, which was then reconstructed. The three typical experiment areas are located in a relatively dense vegetation cover area upstream (upstream test), medium vegetation cover region midstream (midstream test), and low vegetation cover area downstream of HRB (downstream test). Upstream test is the most complicated site because it has relatively less vegetation heterogeneity but large elevation changes.  The simulated cloud cover areas in the circles with different diameters were reconstructed using the TISP method, and the difference between the reconstructed T ck and MODIS LST was calculated at each experiment area (Figures 10 and 11). The result in-dicates that the TISP method has better performance in the upstream and midstream and the difference image was very close to 0 K ( Figure 12). The downstream test result (Figures 10c and 11c) shows that the TISP method may overestimate at daytime and underestimate at nighttime. Therefore, the TISP method is more accurate in areas of dense vegetation than bare land, because MODIS LST has lower accuracy in bare land, especially in the daytime. At nighttime, all regions have similar accuracy and better than daytime because the land surface becomes more homogeneous during nighttime. For each test region, the difference between MODIS LST and T ck displays similar results among different diameters.
To quantitatively analyze the accuracy of the TISP method for T ck generation, the MBE and standard deviation (STD) between MODIS LST and T ck were calculated ( Figure 12). Most MBE of the T ck is concentrated in ±1 K. The range of MBE in the daytime is slightly decreased from 0.79 K to −0.10 K upstream, increased from −0.01 to 0.94 K midstream, decreased from −3.53 K to −2.65 K downstream. At nighttime, the range of MBE is smaller and more stable than daytime, ranging from −0.71 to −0.63 K upstream, from −0.33 to −0.18 K midstream, from 0.63 to 0.76 K downstream.
The average STD for each test area is less than 3 K in the daytime and less than 1.5 K at nighttime (Figure 12). At daytime, STD ranges from 2.47 to 2.71 K upstream, from 1.69 to 1.96 K midstream, and from 0.83 K to 1.87 K downstream. The STD remains relatively stable in the three experiment regions with D = 50 km to 150 km. At nighttime, STD changes from 1.12 K to 1.23 K upstream, from 1.04 K to 1.19 K midstream, and from 0.63 K to 0.91 K downstream. The STDs at nighttime are more stable than at daytime.
In most experiments, the accuracy of the TISP method is similar in different simulated cloud cover areas because, for a certain area, the regressed function derived from similar pixels between the reconstructed and reference image are relatively stable. Moreover, Figures 10-12 indicate that the downstream test has the greatest MBE among all tests, while the upstream and midstream tests show smaller LST bias. For the cloud cover area, the spatial structure of the regional LST distribution has a greater influence on the TISP method than the area of cloud cover area.
To test the topographic impact of the TISP method, the simulated cloud cover area with 200 km diameter upstream was selected to compare the MBE with the DEM and NDVI range at daytime and nighttime. The distribution histograms of MBE for the difference image (original MODIS LST minus T ck ) are shown in Figure 13a,d for day and night, respectively. The MBE distribution histogram shows that the average biases of most test areas are close to 0 K and are concentrated in ±1 K, although there were few larger bias values. The percentage of bias in the range ±1 K is 54.6% during daytime and 72.4% at nighttime. Therefore, the TISP method yields slightly higher accuracy with nighttime than daytime data. Figure 13b,e show the MBE within different classes of DEM in daytime and nighttime. From Figure 13b,e, the MBE is higher at high elevations (>4200 m), but the percent is less than 5%. Figure 13c,f shows the MBE within different classes of NDVI. From Figure 13c,f, the MBE is higher in the lower NDVI (<0.1), but the percent is <5%. Therefore, judging from the overall accuracy and the effect of the DEM and NDVI, the TISP reconstruction method is sensitive to the DEM factor at high elevations and the NDVI factor at lower NDVI values at daytime.

Research Limitations
The TISP method proposed in this paper provides a new framework that takes full advantage of ground-based observation. While it shows good accuracy, TISP also still presents some limitations.
The impact factors used to select similar pixels in TISP include DEM, slope, aspect, NDVI, and theoretical radiation. Although NDVI represents, to some extent, the state of the land cover, it may cause a similar pixel selection algorithm to misclassify low NDVI areas. Bare land has the largest area in HRB and is widely distributed, including bare rock, the Gobi Desert, sand desert, saline land, etc. It is difficult to discriminate among these bare land types due to their low NDVI values; therefore, to improve the accuracy of the

Research Limitations
The TISP method proposed in this paper provides a new framework that takes full advantage of ground-based observation. While it shows good accuracy, TISP also still presents some limitations.
The impact factors used to select similar pixels in TISP include DEM, slope, aspect, NDVI, and theoretical radiation. Although NDVI represents, to some extent, the state of the land cover, it may cause a similar pixel selection algorithm to misclassify low NDVI areas. Bare land has the largest area in HRB and is widely distributed, including bare rock, the Gobi Desert, sand desert, saline land, etc. It is difficult to discriminate among these bare land types due to their low NDVI values; therefore, to improve the accuracy of the

Research Limitations
The TISP method proposed in this paper provides a new framework that takes full advantage of ground-based observation. While it shows good accuracy, TISP also still presents some limitations.
The impact factors used to select similar pixels in TISP include DEM, slope, aspect, NDVI, and theoretical radiation. Although NDVI represents, to some extent, the state of the land cover, it may cause a similar pixel selection algorithm to misclassify low NDVI areas. Bare land has the largest area in HRB and is widely distributed, including bare rock, the Gobi Desert, sand desert, saline land, etc. It is difficult to discriminate among these bare land types due to their low NDVI values; therefore, to improve the accuracy of the TISP method on bare land, we need to further develop a long time series land cover product that includes more refined land cover types.
The original MODIS LST data presents some anomalies (Figure 3a) caused by undetected clouds [53,69]. Although some of the anomalies can be eliminated through quality control, they cannot be completely eliminated, affecting reference image generation and model construction. Therefore, more attention needs to be paid to identifying cloudcontaminated pixels in the MODIS LST product before performing LST reconstruction.
On the other hand, two empirical parameters may affect the accuracy of the TISP method, one is the number of consecutive days, and another is the data gaps rate threshold. Setting a larger number of consecutive days can increase the amount of valid data in the image. However, it can also lead to a weakening of the correlation between similar pixels in the reference image. Too small of a data gap rate threshold may result in a small number of reference images, consequently the large interval between the reference image and reconstructed image.
The impact of cloud type on LST for different land covers needs to be studied further. In this paper, the theoretical clear-sky LST was compared with ground-based observations and found that the effect of clouds on LST varies seasonally and with land cover. At night, the warming effect of clouds on LST is higher in spring and autumn under most vegetation types, which is mainly due to the different cloud types in these seasons. The spatial distribution and temporal variation characteristics of cloud types in HRB were not discussed in depth in this paper. Research on the influence of cloud types on LST should be strengthened in future studies.
For regions with adequate and typical ground-based observations, the TISP method is applicable to other thermal remote sensing sensors with short satellite revisit cycles, which can provide a sufficient number of validation data required for bias correction.
The TISP bias correction requires in situ LST measurements to construct the cloud effect temperature term. Therefore, it cannot be directly applied to regions without in situ measurements. However, the bias correction can be improved by calculating the cloud effect temperature term from the comparison between calibrated land skin temperature of reanalysis data [39] against the theoretical clear sky LST.
In recent years, all-weather LST generation methods have evolved from single data source to multi-source data fusions. Future research will focus on combining the advantages of TIR and microwave LST from polar and geostationary satellites, ground-based observations, and model output data to derive a global, gap-free LST product.

Conclusions
In this research, we proposed the TISP method to reconstruct actual cloudy-sky LST for MOD11A1 data. First, the theoretical clear-sky land surface temperature was estimated based on the improved similar pixel algorithm. Then, bias correction for the theoretical clear-sky land surface temperature used the cloud effect temperature component calculated from the statistical relation between in situ LST measurements and theoretical clear-sky land surface temperature.
The estimated theoretical clear-sky temperature indicated that clouds cool/warm the land surface during day/night in HRB, China. For bare land located mid-and downstream, the temperature change trends caused by cloud cover have obvious seasonality and similarity; cloudiness causes an approximate reduction of the yearly mean value of 4.14 K, ranging from 3.18 to 5.12 K during the day, while an increase of the yearly mean value of 3.99 K, ranging from 3.38 to 4.55 K at night on a monthly average was observed. For dense vegetation, temperature change trends have small seasonality and are very similar across stations during the day.
This TISP method and MODIS LST product had been validated against in situ observation of LST at 15 stations during 2013-2018, and the accuracy assessment of the reconstructed actual cloudy-sky LST at daytime shows that MBEs are between ±1.5 K, except for one station of −3.56 K, RMSEs of 3.40-5.66 K, and R 2 of 0.83-0.95; at nighttime, the accuracy shows that MBEs are between ±1.0 K, RMSEs of 2.57-3.47 K, and R 2 of 0.90-0.96. Overall, the accuracy assessment of the cloudy-sky LST with results in mean MBE of −0.43/0.15 K, mean RMSE of 4.41/2.91 K, and R 2 of 0.90/0.93 at daytime/nighttime, which is consistent with the recently published studies [37,41] and MODIS LST product accuracy with a mean MBE of 1.32/−1.58 K, and a mean RMSE of 3.96/2.64 K.
Overall, the developed method maximizes the potential use of MODIS LST data over cloud cover effect on temperature with in situ observations, and can successfully reconstruct cloudy pixels under all-sky conditions. The validation and experimental results show that for most land surface types the TISP method has high accuracy, except over sparse vegetation types. Although there is still uncertainty in the TISP method, this study proposed an effective and very simple guideline for MODIS cloudy-sky LST reconstruction at the basin scale where abundant ground-based measurements for most land surface types are available. In a future study, the new fusion framework will be developed to generate a long time series of LST datasets by taking advantage of additional thermal infrared and microwave remote sensing data, model outputs, and in situ observation.