A Random Forest-Based Data Fusion Method for Obtaining All-Weather Land Surface Temperature with High Spatial Resolution

Abstract: Land surface temperature (LST) is an important parameter for reflecting the water-heat exchange and balance on the Earth's surface. Passive microwave (PMW) LST can make up for the lack of thermal infrared (TIR) LST caused by cloud contamination, but its resolution is relatively low. In this study, we developed a TIR and PMW LST fusion method based on the random forest (RF) machine learning algorithm to obtain all-weather LST with high spatial resolution. Since LST is closely related to land cover (LC) type, terrain, vegetation conditions, moisture conditions, and solar radiation, these variables were selected as candidate auxiliary variables to establish the best model and obtain the fusion results for mainland China during 2010. In general, the fused LST had higher spatial integrity than the MODIS LST and higher accuracy than the downscaled AMSR-E LST. Additionally, the magnitude of LST data in the fusion results was consistent with the general spatiotemporal variations of LST. Compared with in situ observations, the RMSEs of clear-sky fused LST and cloudy-sky fused LST were 2.12-4.50 K and 3.45-4.89 K, respectively. By combining the RF method and the DINEOF method, a complete all-weather LST with a spatial resolution of 0.01° can be obtained.


Introduction
Land surface temperature (LST) is an important indicator of energy balance and material exchange on the surface of the Earth, and has been widely used in many fields [1][2][3][4][5]. With the advancement of remote sensing technology and the stimulus of strong application demands, the number of Earth observation satellites has increased rapidly, producing a massive amount of satellite data [5]. Various advanced LST products can be generated from these satellite data [6]. However, due to cloud contamination and defects of the retrieval algorithms, advanced remote sensing (RS) products derived from single sensors are prone to spatial incompleteness, temporal discontinuity, and inconsistent physical meanings [7]. In contrast, the same products derived from multisensor observations may be complementary in spatial integrity and data quality. For example, satellite-retrieved LST includes thermal infrared (TIR) LST and passive microwave (PMW) LST. The TIR LST data have high spatial resolution (e.g., 1 km for Moderate Resolution Imaging Spectroradiometer (MODIS) LST) and high retrieval accuracy (approximately 0.3-2 K), but there are many missing values in the data due to clouds [8][9][10][11][12][13]. PMW radiation can penetrate clouds, but PMW LST data have a relatively low spatial resolution (e.g.,
Frequent cloud coverage in China has limited the application of TIR LST in this region. Therefore, mainland China was chosen as the study area; its location is shown in Figure 1, with a true-color image as the background. The terrain of China is highly complex, and its ecosystems range from glaciers and deserts to grasslands, wetlands, tropical rain forests, lakes, and oceans, which leads to large spatial temperature differences within its territory [38]. Furthermore, its climate is dominated by wet monsoon and dry seasons, which leads to drastic temperature changes between seasons [39].
Two verification regions were chosen, one in the Tibetan Plateau (TP) region and one in the Heihe River Basin (HRB) region. The verification region in the TP is located near the city of Naqu. The elevation of this verification region ranges from 2752 m to 6994 m and the average slope is 5.26°. Its land cover (LC) is mainly savanna and grassland. The verification region in the HRB is located on the border of Qinghai Province and Gansu Province. Its surface elevation is 1056-5314 m. The average slopes of the Qilian Mountains in the southwest and the plains in the northeast are 7.27° and 1.03°, respectively. Its LC types include cropland, forest, sparse grassland, and barren land. The locations of the TP and HRB regions are indicated by two red rectangles in Figure 1, and the locations of in situ measurement sites are marked in Figure 1a,b with green circle symbols.

Data
Daily AMSR-E brightness temperature (BT) and MODIS LST were used as the main data. To obtain the PMW LST data required for the RF fusion process, LC data, snow cover data, elevation data, desert distribution data, and normalized difference vegetation index (NDVI) data were required in the selected PMW LST generation method [40]. As LST is regulated by LC type, terrain, and vegetation, the data corresponding to these factors are needed during the RF fusion process [41]. In addition, longitude and latitude belong to spatial information and can reflect the moisture condition from coastal to inland areas, while latitude maps can reflect the difference in solar radiation [42]. The study by Hengl et al. [43] proved that considering the latitude and longitude when using the ML algorithm can strengthen the spatial interaction in the training process of the trees and improve spatial nonstationarity. The downward shortwave radiation (DSR) represents the difference in solar radiation at different latitudes, so DSR data can also reflect the difference in spatial position to a certain extent. Therefore, the data used in the RF fusion process also included longitude, latitude, and DSR data. In addition, the in situ data from the HRB and TP regions were used as reference data to verify the accuracy of the selected model.

Satellite Data
AMSR-E is a microwave sensor onboard the Aqua satellite. The AMSR-E BT data were obtained from the National Snow & Ice Data Center (NSIDC) (https://nsidc.org/) (Last accessed on 4 June 2021). The data included the BT data for six different frequencies (6.9, 10.7, 18.7, 23.8, 36.5, and 89.0 GHz) in two polarization channels (horizontal and vertical polarization).
MODIS is an important sensor onboard the Terra and Aqua satellites. The AMSR-E and MODIS sensors on the Aqua satellite observe the Earth's surface simultaneously, at approximately 1:30 p.m. local time in the daytime and 1:30 a.m. local time at night. The MODIS data were provided by the Level-1 and Atmosphere Archive & Distribution System (LAADS) Distributed Active Archive Center (DAAC) (https://ladsweb.modaps.eosdis.nasa.gov/) (Last accessed on 4 June 2021), which contains MODIS LST products, LC products, snow products, and NDVI products. The MODIS LST product (MYD11A1) was derived from two MODIS thermal infrared channels (31 and 32) using the generalized split-window algorithm [44]; it contains daily daytime and nighttime LST, quality control (QC), and transit time information. The QC information was used to identify high-quality MODIS LST pixels. LST pixels that were flagged as "LST produced", "good data quality", "average emissivity error ≤ 0.02", and "average LST error ≤ 1 K" in the QC layer were considered high-quality pixels and used in this study. The transit time information for each pixel was used to match the pixel with the in situ data. In addition, the spatial information (latitude and longitude data) can also be acquired from this LST dataset. The LC data generated according to the International Geosphere-Biosphere Programme (IGBP) classification system in the MODIS LC product (MCD12Q1) were used for this study. These LC type data were used for the PMW LST generation and the RF fusion process. In the RF fusion process, the LC types were merged into five types, namely soil, vegetation, water, ice and snow, and buildings. The MODIS snow cover product (MYD10C1), which represents the percentage of snow area in the entire grid area, provides the snow data required by the PMW LST generation process.
The NDVI data provided by the MODIS vegetation index product (MYD13A2) were used in the PMW LST generation process, and also used in the RF fusion process as vegetation-related information.
The elevation data were taken from the Shuttle Radar Topography Mission (SRTM) dataset, which is a global elevation dataset collected by the radar onboard the space shuttle Endeavour in February 2000. The data were downloaded from http://srtm.csi.cgiar.org/srtmdata/ (Last accessed on 4 June 2021). These elevation data and the slope data derived from them were used for the PMW LST generation process.
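The four QC criteria listed above can be expressed as a bit-mask test on the 8-bit QC value. The following is a minimal sketch, not the processing code used in this study. It assumes the commonly documented MYD11A1 QC layout (bits 0-1 mandatory QA, bits 2-3 data quality, bits 4-5 emissivity error, bits 6-7 LST error), which should be checked against the product user guide; the function name `is_high_quality` is hypothetical.

```python
def is_high_quality(qc: int) -> bool:
    """Return True if an 8-bit QC value passes the four criteria in the text.

    Assumed bit layout (verify against the product user guide):
      bits 0-1: mandatory QA, 0 or 1 means an LST value was produced
      bits 2-3: data quality flag, 0 means good data quality
      bits 4-5: emissivity error, 0 means <= 0.01 and 1 means <= 0.02
      bits 6-7: LST error, 0 means <= 1 K
    """
    mandatory = qc & 0b11
    data_quality = (qc >> 2) & 0b11
    emis_error = (qc >> 4) & 0b11
    lst_error = (qc >> 6) & 0b11
    return (
        mandatory <= 1
        and data_quality == 0
        and emis_error <= 1
        and lst_error == 0
    )


# Example: keep only the QC codes that pass all four tests.
sample_codes = [0, 16, 32, 64, 2]
kept = [code for code in sample_codes if is_high_quality(code)]
```

In practice the same test would be applied element-wise to the whole QC raster to build a mask of clear-sky training pixels.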
The desert distribution data required during the PMW LST generation process were the China desert distribution vector data and were downloaded from the Cold and Arid Region Science Data Center (http://westdc.westgis.ac.cn) (Last accessed on 12 October 2018). The data needed to be converted into raster data for use [45].
The DSR data were obtained from the Global Land Surface Satellite (GLASS) products. The GLASS DSR products are generated from the data of multiple polar-orbit satellites (MODIS) and geostationary satellites (Geostationary Operational Environmental Satellite (GOES) imager; Meteosat Second Generation (MSG) SEVIRI; Multi-functional Transport Satellite (MTSAT)-1R imager) using an improved look-up table (LUT) method by radiative simulation based on MODTRAN [46,47]. The DSR data were downloaded from the National Earth System Science Data Center, National Science & Technology Infrastructure of China (http://www.geodata.cn) (Last accessed on 4 June 2021), and were used in the RF fusion process.
The basic information of the datasets used in this study is shown in Table 1. The study period ranges from 1 January 2010 to 31 December 2010. The MYD11A1, MCD12Q1, MYD10C1, MYD13A2, SRTM DEM, map of the desert distribution of China, and GLASS DSR were uniformly converted into geographical latitude/longitude coordinates and resampled to a spatial resolution of 0.01°.

In Situ Measurements
In order to evaluate the fused LST, a land-atmosphere interaction observations dataset from the TP region [48], and the Automatic Weather Stations dataset (AWS) from the Heihe region [49] were used.
The land-atmosphere interaction observations dataset was downloaded from the National Tibetan Plateau Data Center (https://doi.org/10.11888/Meteoro.tpdc.270910) (Last accessed on 4 June 2021) [48]. It includes the four-component radiation, multi-layer soil temperature, humidity and soil heat flux, and other observations. In this study, the data from the BJ site of Nagqu Station of Plateau Climate and Environment and Nam Co Monitoring and Research Station (NAMORS) for Multisphere Interactions were selected as the verification data. The LC types of BJ and NAMORS sites are alpine meadow and alpine steppe, respectively.
The AWS dataset was obtained from Watershed Allied Telemetry Experimental Research (WATER), which was provided by the Heihe Plan Data Management Center (http://www.heihedata.org/) (Last accessed on 12 October 2018) [49]. The observation items included the four-component radiation, the multi-layer soil temperature, soil moisture, soil heat flux, and other observations. The data from the Arou (AR) and Yingke (YK) sites were selected as the verification data in the study. The LC types of AR and YK sites are alpine meadow and cropland, respectively. The basic information about these verification sites is presented in Table 2. The locations of these sites are marked with green circle symbols in Figure 1a,b. The radiation data, including the surface longwave upwelling and downward radiation, were used in the verification process to calculate in situ temperature data by using the Stefan-Boltzmann law, as shown in the following equation:
$$T_s = \left[\frac{F^{\uparrow} - (1 - \varepsilon_{bb})\,F^{\downarrow}}{\varepsilon_{bb}\,\sigma}\right]^{1/4} \qquad (1)$$
In the equation, $T_s$ is the calculated in situ temperature; $F^{\uparrow}$ and $F^{\downarrow}$ are the surface longwave upwelling radiation and the surface longwave downward radiation, respectively; $\sigma$ is the Stefan-Boltzmann constant ($5.67 \times 10^{-8}$ W m$^{-2}$ K$^{-4}$); and $\varepsilon_{bb}$ is the surface broadband emissivity, which was computed from the ASTER GED product [50] by using the following equation according to Cheng, et al. [51]:
$$\varepsilon_{bb} = 0.197 + 0.025\,\varepsilon_{10} + 0.057\,\varepsilon_{11} + 0.237\,\varepsilon_{12} + 0.333\,\varepsilon_{13} + 0.146\,\varepsilon_{14} \qquad (2)$$
In the equation, $\varepsilon_{10}$-$\varepsilon_{14}$ are the surface narrowband emissivities of ASTER bands 10-14, respectively.
The time resolution of these radiation measurements in the TP and HRB regions was 1 h and 30 min, respectively, and the radiation measurements with the closest observation time to the transit time of MODIS were selected for the verification process. Therefore, the time difference between the field observation and satellite overpass was no more than 30 min in the TP region and no more than 15 min in the HRB region.
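The conversion from longwave radiation to in situ temperature can be sketched as follows. This is an illustrative example only: it takes Equation (1) to be the usual inversion of the longwave balance, $T_s = [(F^{\uparrow} - (1 - \varepsilon_{bb})F^{\downarrow}) / (\varepsilon_{bb}\sigma)]^{1/4}$, which is an assumption about its exact form, and uses the coefficients of Equation (2) as given in the text. The function names are hypothetical.

```python
SIGMA = 5.67e-8  # Stefan-Boltzmann constant, W m^-2 K^-4 (value from the text)


def broadband_emissivity(e10, e11, e12, e13, e14):
    """Broadband emissivity from ASTER band 10-14 emissivities, Equation (2)."""
    return (0.197 + 0.025 * e10 + 0.057 * e11
            + 0.237 * e12 + 0.333 * e13 + 0.146 * e14)


def in_situ_lst(f_up, f_down, emis_bb):
    """In situ temperature (K) from upwelling and downward longwave radiation.

    Assumed form of Equation (1): the reflected part of the downward
    radiation is removed before applying the Stefan-Boltzmann law.
    """
    emitted = f_up - (1.0 - emis_bb) * f_down
    return (emitted / (emis_bb * SIGMA)) ** 0.25


# Round-trip check with made-up values: build F_up for a 300 K surface,
# then recover the temperature.
emis = broadband_emissivity(0.97, 0.97, 0.97, 0.97, 0.97)
f_down_demo = 300.0  # W m^-2, hypothetical
f_up_demo = emis * SIGMA * 300.0 ** 4 + (1.0 - emis) * f_down_demo
recovered = in_situ_lst(f_up_demo, f_down_demo, emis)
```

The round trip recovers the 300 K surface temperature, which is a quick way to confirm that the reflected-radiation term has the intended sign.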
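Selecting the radiation record closest to the satellite transit time can be sketched as below. The function name, the `max_gap` tolerance argument, and the example timestamps are hypothetical; only the nearest-time rule itself comes from the text.

```python
from datetime import datetime, timedelta


def nearest_observation(obs_times, overpass, max_gap):
    """Index of the observation closest in time to the overpass.

    Returns None when there are no observations or when the closest one
    is farther away than max_gap.
    """
    if not obs_times:
        return None
    idx = min(range(len(obs_times)),
              key=lambda i: abs(obs_times[i] - overpass))
    if abs(obs_times[idx] - overpass) > max_gap:
        return None
    return idx


# Hourly records (as in the TP region) and 30-min records (as in the HRB).
hourly = [datetime(2010, 7, 15, h) for h in range(24)]
half_hourly = [datetime(2010, 7, 15) + timedelta(minutes=30 * i)
               for i in range(48)]
overpass_demo = datetime(2010, 7, 15, 13, 20)  # hypothetical transit time

tp_idx = nearest_observation(hourly, overpass_demo, timedelta(minutes=30))
hrb_idx = nearest_observation(half_hourly, overpass_demo, timedelta(minutes=15))
```

With hourly sampling the worst-case gap is half an hour, and with 30-min sampling it is a quarter of an hour, which matches the limits quoted in the text.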

PMW LST Data Generation
In this study, the LUT-based AMSR-E LST retrieval algorithm proposed by Zhang and Cheng [40] was used to generate AMSR-E LST data. The main idea of this algorithm is to establish a comprehensive classification system of environmental variables (CCSEV). The data in the SRTM DEM, MCD12Q1, map of the desert distribution of China, and MYD10C1 datasets represent these factors for the establishment of the CCSEV. Then, the AMSR-E BT and the upscaled MODIS LST obtained by simple averaging were subjected to stepwise regression, and the retrieval formula for each CCSEV class was established separately. The accuracy (root mean square error (RMSE)) of the AMSR-E LST retrieved by this method was 2.65-3.48 K during the daytime and 2.15-2.94 K at nighttime. For the specific content of the LST retrieval algorithm, please refer to Zhang and Cheng [40].
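The "simple averaging" used to upscale MODIS LST to the AMSR-E grid can be illustrated with a block mean. This is a sketch under stated assumptions: missing pixels are encoded as NaN and ignored, no minimum coverage threshold is applied, and the fine grid divides evenly into coarse cells (a factor of 25 would take 0.01° to 0.25°). The function name is hypothetical.

```python
import numpy as np


def upscale_by_mean(fine, factor):
    """Average non-overlapping factor x factor blocks, ignoring NaN pixels.

    Blocks that contain no valid pixel are returned as NaN.
    """
    rows, cols = fine.shape
    if rows % factor or cols % factor:
        raise ValueError("grid size must be a multiple of the factor")
    blocks = fine.reshape(rows // factor, factor, cols // factor, factor)
    valid = ~np.isnan(blocks)
    counts = valid.sum(axis=(1, 3))
    sums = np.where(valid, blocks, 0.0).sum(axis=(1, 3))
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(counts > 0, sums / counts, np.nan)


# Tiny example: a 4 x 4 grid reduced to 2 x 2.
demo = np.arange(16, dtype=float).reshape(4, 4)
coarse = upscale_by_mean(demo, 2)
```

Whether partly cloudy coarse cells should be kept or discarded is a design choice that the text does not specify.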
The AMSR-E LST data were used in RF to achieve the fusion with the MODIS LST data. As PMW data, the AMSR-E LST data represent the true situation of the cloudy-sky land surface. In order to obtain high spatial resolution all-weather LST, the AMSR-E LST data were downscaled to a spatial resolution consistent with the MODIS LST. A downscaling method based on the geographically weighted regression (GWR) model was used. This method sets a series of intermediate resolution levels between the initial resolution of 0.25° and the target resolution of 0.01°, uses the GWR model to establish the relationship between the LST and the scale conversion factors at each level in turn, and gradually downscales the AMSR-E LST from 0.25° to 0.01°. In this method, the scale conversion factors are usually NDVI, elevation, and slope. Therefore, the NDVI data in MYD13A2, the elevation data in SRTM DEM, and the slope data derived from the elevation data were used. For more details on the GWR method, please see Zhang, et al. [52]. The downscaled AMSR-E LST data were generated by applying the retrieval and downscaling methods described above.

LST Fusion Based on RF Method
LST is affected by many complex factors, so the selection of the best independent variable combination is crucial for LST fusion. Theoretically, the spatiotemporal pattern of LST is related to terrain, LC, soil moisture, and incoming solar radiation [41]. Therefore, in this study, elevation, NDVI, LC, longitude and latitude, and DSR were selected as candidate variables. The spatial distribution of LST is related to the topographical fluctuations in the study area, so elevation is a necessary predictor [53][54][55]. Since the vegetation-covered area, calculated from the LC data, accounts for 97.4% of the total study area, NDVI was also used as a necessary predictor to further describe the vegetation characteristics. Because LC type also influences LST, LC was selected as a candidate variable as well [56].
RF, an integrated ML algorithm that evolved from the bagging algorithm, can be used for regression and classification research [34]. As a nonlinear method, RF consists of many decision trees. These decision trees are constructed from a randomly selected subset to lower the correlation between different decision trees [57,58]. The final output of the RF model is obtained by combining the results of all decision trees. The RF model has few parameter settings, fast training speed, high prediction accuracy, and can accurately capture the nonlinear interaction between variables [59]. In addition, the RF method is also an effective method to predict missing data, which can maintain accuracy even when most of the data are absent [60]. Thus, the RF method was chosen to accurately express the nonlinear relationship between LST and the important factors affecting it, and to achieve the purpose of using only clear-sky data to predict all-weather LST. However, because the input in the RF method is a series of values that does not contain spatial information, this method lacks the spatial information contained in the RS data. Longitude and latitude belong to spatial information and also reflect the difference of soil moisture and solar radiation, so its inclusion will likely improve the results of the fusion. DSR represents the differences in solar radiation at different latitudes, and it can also be used to characterize differences in spatial position.
Therefore, in this study, the candidate variables for the RF model include elevation, NDVI, LC, longitude and latitude, and DSR. Among them, elevation, NDVI, longitude and latitude, and DSR were directly used as independent variables, and LC was used to segregate data into different data bins. Four RF models composed of different candidate variables were tested, of which models ii-iv considered spatial information, as follows: (i) downscaled AMSR-E LST, elevation, and NDVI are the independent variables, clear-sky MODIS LST is the dependent variable; (ii) based on model i, latitude and longitude are added as independent variables; (iii) based on model ii, data are separated into different bins according to the LC type, and an RF model trained for each bin separately; (iv) based on model i, DSR is added as an independent variable, data are separated into different bins according to the LC type, and an RF model trained for each bin separately. Table 3 shows the variable selection status of the four candidate RF models. The RMSE was used as an indicator to select the best RF model among the four candidate models. The best selected RF model was used to achieve LST data fusion and thus predict all-weather LST data.
The most complicated model, model iii, was used as an example, and the flowchart for implementing model iii is shown in Figure 2. The process can be divided into training and prediction; before the training process begins, the input data were selected according to the availability of MODIS LST (using QC information as an indicator, see Section 2.2.1 for details). The steps in the training process were as follows: (1) Separate MODIS LST, downscaled AMSR-E LST, NDVI, elevation, and longitude and latitude data to different bins according to LC type in the MODIS LC data; (2) Use stratified random sampling to divide the input data of each bin into two parts: 80% of inputs from each bin were randomly selected as the training data, and the remaining 20% of inputs from each bin were reserved as verification data; (3) Train the RF model separately for each LC type; (4) Use the corresponding RF model to predict LST of each LC type separately in the remaining 20% of inputs; (5) Calculate the RMSE value of the predicted LST and the remaining MODIS LST, which was used to select the best RF model. The steps of the prediction process are as follows: (1) Separate the downscaled AMSR-E LST, NDVI, elevation, and longitude and latitude data into different bins according to different LC type; (2) Use the RF model obtained from the training process to predict all-weather LST. See Section 4.1 for the selection results of the best RF model.
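The training and prediction steps of model iii can be sketched with scikit-learn. This is a minimal illustration, not the implementation used in the study: the number of trees, the random seed, the feature column order (downscaled AMSR-E LST, NDVI, elevation, longitude, latitude), and the function names are all assumptions. Splitting 80/20 inside each LC bin is used here as a stand-in for the stratified random sampling described in the text.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split


def train_per_land_cover(X, y, lc, n_trees=100, seed=0):
    """Train one RF per LC class on 80% of each bin; RMSE on the other 20%.

    X: (n_samples, n_features) predictors at clear-sky pixels
    y: (n_samples,) clear-sky MODIS LST
    lc: (n_samples,) integer LC class of each pixel
    """
    models, rmse = {}, {}
    for cls in np.unique(lc):
        in_bin = lc == cls
        X_tr, X_te, y_tr, y_te = train_test_split(
            X[in_bin], y[in_bin], test_size=0.2, random_state=seed)
        rf = RandomForestRegressor(n_estimators=n_trees, random_state=seed)
        rf.fit(X_tr, y_tr)
        resid = rf.predict(X_te) - y_te
        models[cls] = rf
        rmse[cls] = float(np.sqrt(np.mean(resid ** 2)))
    return models, rmse


def predict_all_weather(models, X, lc):
    """Apply the LC-specific model to every pixel, clear or cloudy."""
    out = np.full(X.shape[0], np.nan)
    for cls, rf in models.items():
        in_bin = lc == cls
        if in_bin.any():
            out[in_bin] = rf.predict(X[in_bin])
    return out
```

Pixels whose LC class has no trained model stay NaN in the output, so unsupported classes are visible instead of being filled silently.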

Comparison of RF Model Results
Four RF models were used to fuse the daytime LST data for the year 2010. Figure 3 shows the training results of the RF models. The R² values of the four RF models were between 0.6 and 1, while the R² values of models ii and iii were both close to 1. The RMSE of the four RF models was between 1.5 K and 6 K, among which the RMSEs of models ii and iii were relatively low (between 1.5 K and 3 K), and the RMSEs of models i and iv were relatively high (between 2.5 K and 6 K). Model i, which did not consider spatial information, performed worst, with the lowest R² value and the highest RMSE. Models ii and iii, containing longitude and latitude information, performed relatively well. Model iii outperformed model ii because it not only considered the effects of NDVI, elevation, and latitude and longitude, but also trained separate models for the data corresponding to different LC types. This may be because the temperature generation mechanism varies with the LC type, and thus modeling each LC type separately slightly improves the accuracy of the fusion result. Since the DSR data in model iv can only represent changes in latitude, the accuracy of model iv is lower than that of model iii. Therefore, model iii was selected as the final model to fuse the MODIS and AMSR-E LST data.

The Effect of the Fused LST
The effect of the fused LST was investigated from two aspects: qualitative analysis and quantitative verification. One day in each month (the 15th) was selected to represent the twelve months for MODIS LST. Spatial patterns of MODIS LST during daytime and nighttime are shown in Figures 4 and 5, respectively. Severe data loss was observed on each day due to the impact of cloud contamination. Figures 6 and 7 show the spatial patterns of the fusion results corresponding to these MODIS LST data. Compared with MODIS LST, the spatial integrity of the fusion results was greatly improved, except for some blank areas caused by the orbit gap of the AMSR-E sensor. In addition, in terms of time, the magnitude of LST data in these images gradually increases from January to July, and gradually decreases from August to December. In space, the LST in northeast China and the Qinghai-Tibet Plateau is relatively low, while the LST in south China and the desert areas in northwest China is relatively high. This indicates that the magnitude of LST data in the fusion results is consistent with the general spatiotemporal variations of LST.
Lastly, we tested the fused LST using the reserved 20% MODIS LST data (Figure 8). We found that the nighttime RMSE was smaller than the daytime RMSE, with the daytime RMSE around 2 K and the nighttime RMSE around 1 K.

LST Verification
The in situ temperature data calculated by the surface longwave upwelling and downward radiation were used in the verification process. For comparison, both MODIS LST data and fused LST were verified. Figure 9 shows the scatter plots of the in situ temperature and MODIS LST. The RMSE of MODIS LST was 3.20 K, 4.44 K, 2.18 K, and 2.53 K at the BJ, NAMORS, AR, and YK stations, respectively. The scatter plots of the in situ temperature and fused LST are provided in Figure 10. At the BJ, NAMORS, AR, and YK stations, the RMSE of clear-sky fused LST was 3.18 K, 4.50 K, 2.12 K, and 2.64 K, which was similar to the RMSE of MODIS LST, indicating similar accuracy of the clear-sky fused LST and MODIS LST. The RMSE of cloudy-sky fused LST was 3.92 K, 4.89 K, 3.87 K, and 3.45 K, respectively. The accuracy of cloudy-sky fused LST was lower than that of clear-sky fused LST. This is because the accuracy of PMW LST is relatively low under conditions of cloud coverage [20]. However, these cloudy-sky RMSE values can be compared with the cloudy-sky RMSE values of previous machine learning methods. A neural network retrieval method proposed by Aires, et al. [61] has an RMSE value of about 3.1-5 K under cloudy conditions in mid-latitude regions [62]. The machine learning method based on artificial neural network (ANN) models used by Shwetha and Kumar [23] has RMSE values of 2.9-6.2 K for cloudy sky during daytime, and 2.1-3.3 K for cloudy sky during nighttime. In addition, the bias indicates that the cloudy-sky fused LST at all sites was lower than the in situ temperature, which may be related to the slightly deeper thermal sampling depth of the PMW radiation during PMW data collection [14].
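The two verification statistics used in this section, RMSE and bias, can be written out explicitly. The sketch below uses the convention bias = mean(satellite minus in situ), so a negative bias means the satellite LST is lower than the in situ temperature; that sign convention and the sample numbers are assumptions for illustration.

```python
import math


def rmse(pred, obs):
    """Root mean square error between paired estimates and observations."""
    if len(pred) != len(obs) or not obs:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def bias(pred, obs):
    """Mean difference (estimate minus observation)."""
    if len(pred) != len(obs) or not obs:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p - o for p, o in zip(pred, obs)) / len(obs)


# Hypothetical matched pairs (K): fused LST vs in situ temperature.
fused_demo = [290.0, 300.0, 310.0]
in_situ_demo = [292.0, 301.0, 309.0]
demo_rmse = rmse(fused_demo, in_situ_demo)
demo_bias = bias(fused_demo, in_situ_demo)
```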

The Daily Variation of the Fusion LST
The daily variation of fused LST at all sites during 2010 is shown in Figure 11. For reference, the daily variations of the in situ temperature and MODIS LST are also included in this figure. In each figure, the trend of the fused LST time series is close to the trend of the in situ temperature time series and MODIS LST time series, and is consistent with the correct annual LST trend. Therefore, it can be concluded that the LST fused by the RF method can capture the correct time variation of the LST at each site. The deviation of the daytime fused LST is relatively large, which can also be explained by the slightly deeper thermal sampling depth of PMW radiation [14].

Improvement of Integrity
As shown in Figures 6 and 7, due to the orbital gap of the AMSR-E sensor, the fused LSTs still have missing values. To further improve the effectiveness of the fusion results, the data interpolating empirical orthogonal function (DINEOF) method was used. The DINEOF method was proposed by Beckers and Rixen in 2003 and is often used to deal with the problem of missing data [63]. Compared with traditional interpolation methods, the DINEOF method requires fewer input parameters and has higher computational efficiency [64]. This method has been used in many studies and reliable results have been obtained with it [64][65][66][67][68][69][70][71]. Figures 12 and 13 show the spatial distributions of the complete all-weather LST in different months, indicating that the DINEOF method effectively improves the integrity of the fusion result. In addition, in order to further evaluate the performance of the DINEOF method for filling LST values in the satellite orbit gap, the in situ temperature data were used to verify these complete all-weather LST data, shown in Figures 12 and 13. The scatter plot of the in situ temperature and the all-weather LST data is shown in Figure 14. The RMSE of all-weather LST was 3.97 K, which was similar to the average RMSE of the fused LST (3.57 K), indicating that by combining the RF fusion method and the DINEOF method, the complete all-weather LST with high spatial resolution can be generated.
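The core idea of EOF-based gap filling can be sketched as an iterative truncated-SVD reconstruction. This is a simplified illustration and not the full DINEOF procedure: DINEOF selects the number of EOF modes by cross-validation, whereas here `n_modes` is fixed by the caller. Arranging the data as a time-by-space matrix with NaN at the gaps, and the function name, are assumptions.

```python
import numpy as np


def eof_gap_fill(data, n_modes, n_iter=200, tol=1e-6):
    """Fill NaN gaps in a 2-D (time x space) array with a truncated SVD.

    Gaps start at the mean of the observed values and are updated from
    the leading n_modes until the update is smaller than tol.
    Observed values are never modified.
    """
    missing = np.isnan(data)
    mean = np.nanmean(data)
    filled = np.where(missing, 0.0, data - mean)  # anomalies, zero at gaps
    if not missing.any():
        return filled + mean
    for _ in range(n_iter):
        u, s, vt = np.linalg.svd(filled, full_matrices=False)
        recon = (u[:, :n_modes] * s[:n_modes]) @ vt[:n_modes]
        change = np.sqrt(np.mean((recon[missing] - filled[missing]) ** 2))
        filled[missing] = recon[missing]
        if change < tol:
            break
    return filled + mean
```

A fixed mode count keeps the sketch short; too many modes can fit noise in the gaps, which is the reason the full method tunes this number.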

Effects of Missing Value Proportion
It can be seen from Figure 8 that the RMSE of the fusion results varies with the date, which may be related to the missing value proportions. Figure 15 shows the scatter plot of missing value proportions and RMSE, and highlights that RMSE has a positive relationship with the missing value proportions. For dates with a large missing value proportion, the accuracy of the fusion result was generally low, and vice versa.
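The relationship between missing value proportion and RMSE can be examined with a simple correlation. The sketch below assumes missing pixels are stored as NaN and uses the Pearson coefficient as one possible measure; the text does not state which statistic, if any, was computed, and the daily values shown are made up.

```python
import numpy as np


def missing_proportion(lst_image):
    """Fraction of pixels in one LST image that are missing (NaN)."""
    return float(np.isnan(lst_image).mean())


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])


# Hypothetical daily values: missing proportion and validation RMSE (K).
daily_missing = [0.35, 0.50, 0.62, 0.71, 0.80]
daily_rmse = [1.6, 1.9, 2.1, 2.6, 2.9]
r_demo = pearson_r(daily_missing, daily_rmse)
```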

Variable Importance Measure
To investigate the contribution of the input variables in the selected models to the fusion results, the variable importance measures provided by the RF method were used. As shown in Figure 16, downscaled AMSR-E LST and latitude had a significant effect on the LST estimates, while NDVI and elevation had the least effect on the LST estimates, except for the snow and ice and the water LC types. This can be attributed to the contribution already made by NDVI and elevation data during the PMW LST downscaling process.
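Variable importance from a random forest can be obtained as in the sketch below, which uses the impurity-based `feature_importances_` attribute of scikit-learn's `RandomForestRegressor`. Whether the study used this measure or a permutation-based one is not stated, so the choice is an assumption, as are the function name, the feature names, and the synthetic data.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def rank_importance(X, y, names, n_trees=100, seed=0):
    """Fit an RF and return (name, importance) pairs, most important first."""
    rf = RandomForestRegressor(n_estimators=n_trees, random_state=seed)
    rf.fit(X, y)
    order = np.argsort(rf.feature_importances_)[::-1]
    return [(names[i], float(rf.feature_importances_[i])) for i in order]


# Synthetic example in which the first predictor dominates the target.
rng_demo = np.random.default_rng(0)
X_demo = rng_demo.uniform(size=(300, 4))
y_demo = 10.0 * X_demo[:, 0] + 0.1 * X_demo[:, 1]
names_demo = ["amsre_lst", "latitude", "ndvi", "elevation"]
ranked_demo = rank_importance(X_demo, y_demo, names_demo, n_trees=30)
```

For model iii the ranking would be computed once per LC type, since a separate forest is trained for each bin.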

Accuracy Comparison with Downscaled AMSR-E LST
Methods for obtaining complete high-resolution LST can be divided into two categories: kernel-driven methods, which downscale LST through auxiliary data to obtain high-resolution LST; and fusion-based methods, which predict all-weather high-resolution LST by integrating information from different sensors [72]. In this study, the downscaled AMSR-E LST was a high-resolution LST obtained by the kernel-driven method; the fused LST is an all-weather high-resolution LST predicted by the fusion-based method.
In order to explore the necessity of the RF fusion process, MODIS LST was used as verification data to verify the fusion results and the downscaled AMSR-E LST. Figure 17a,b are the scatter density plots of the fusion results and the downscaled AMSR-E LST, respectively, where the first row is daytime data and the second row is nighttime data. It can be seen from Figure 17 that the scatter points of the fusion results are closely distributed around the 1:1 line, whereas the downscaled AMSR-E LSTs are more scattered during both the daytime and nighttime. In addition, the RMSE was obtained and compared to the reserved 20% MODIS LST. The RMSE of the daytime and nighttime LST data obtained by directly downscaling the PMW LST was 5.75 K and 3.48 K, respectively. The RMSE of the daytime and nighttime LST obtained by fusing the TIR and PMW LST data was significantly reduced by 62.21% and 71.87%, respectively, and its RMSE was 2.17 K and 0.98 K. Therefore, in order to obtain more accurate all-weather high-resolution LST, the process of fusing TIR and PMW LST data with the RF method is necessary.
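The relative RMSE reduction quoted above follows from a one-line formula, shown here as a worked check. Applied to the rounded RMSEs in the text, it gives values within about 0.1 percentage points of the reported reductions, which were presumably computed from unrounded RMSEs; the function name is hypothetical.

```python
def rmse_reduction_percent(rmse_before, rmse_after):
    """Relative RMSE reduction in percent: 100 * (before - after) / before."""
    return 100.0 * (rmse_before - rmse_after) / rmse_before


# Downscaled AMSR-E LST vs fused LST, RMSEs (K) as reported in the text.
day_reduction = rmse_reduction_percent(5.75, 2.17)
night_reduction = rmse_reduction_percent(3.48, 0.98)
```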

Conclusions
In this study, the RF model was used to fuse MODIS and AMSR-E LST in mainland China during 2010. The RF model performed best when LC type, terrain, vegetation conditions, moisture conditions, and solar radiation were considered. The magnitude of LST data in the fusion result is consistent with the general spatiotemporal variation of LST.
In order to further evaluate the effectiveness of the RF model, the in situ measurements obtained from the central TP region and upper and middle reaches of the HRB region were used to verify the fused LST. The RMSE of clear-sky fused LST and cloudy-sky fused LST were 2.12-4.50 K and 3.45-4.89 K, respectively. According to the fused LST images of China on the 15th day of each month in 2010 and the time series of fused LST at the verification sites, we found that the fused results of the RF model accurately reflected the spatiotemporal change trend of LST. To further improve the usability of the all-weather LST, the DINEOF method was used to obtain a complete all-weather LST. By exploring the relationship between the RMSE and the missing value proportions, we found that high RMSEs usually corresponded to a large missing value proportion and vice versa. With reference to the variable importance measures, it can be seen that the downscaled AMSR-E LST and latitude have the most significant impact on LST estimation. Compared with the high-resolution LST obtained through downscaling of AMSR-E data, the fusion method of estimating LST had higher accuracy, indicating that it is necessary to use the RF fusion method. The proposed method effectively fuses TIR and PMW LST data, thereby generating all-weather LST data with high spatial resolution.