Retrieval of All-Sky Land Surface Temperature from MERSI-II/FY-3D Data

Zhang, Han-Hao; Jiang, Geng-Ming

doi:10.3390/rs18121954

by Han-Hao Zhang ¹ and Geng-Ming Jiang ¹,²,*

¹ College of Future Information Technology, Fudan University, Shanghai 200433, China

² Key Laboratory for Information Science of Electromagnetic Waves, Fudan University, Shanghai 200433, China

* Author to whom correspondence should be addressed.

Remote Sens. 2026, 18(12), 1954; https://doi.org/10.3390/rs18121954 (registering DOI)

Submission received: 20 April 2026 / Revised: 9 June 2026 / Accepted: 10 June 2026 / Published: 12 June 2026

(This article belongs to the Special Issue Land Surface Temperature Retrieval and Cross-Validation for Remote Sensing Applications)

Highlights

What are the main findings?

An improved split-window algorithm is developed using numerical radiative transfer simulation experiments and is successfully applied to retrieving accurate clear-sky land surface temperature (LST) from MERSI-II/FY-3D data.
A hybrid method combining the eXtreme Gradient Boosting (XGBoost) model and surface energy balance theory is proposed to estimate cloudy-sky LSTs from MERSI-II/FY-3D data. The XGBoost model is used to estimate hypothetical clear-sky LSTs, while the surface energy balance theory is employed to correct cloud radiation effect, enabling accurate LST retrieval under cloudy-sky conditions.

What are the implications of the main findings?

The results of clear-sky LST retrieval indicate that MERSI-II/FY-3D data are reliable and can be used to produce clear-sky LSTs at a level comparable to well-established satellite products. This matters because it strengthens confidence in using FY-3D as an independent or complementary data source, which is valuable for continuity when MODIS data are unavailable or for cross-validation in long-term climate records.
The combination of machine learning (XGBoost) with surface energy balance theory demonstrates a successful fusion of data-driven and physics-based approaches. This is important because purely statistical models often lack physical interpretability, while purely physical models struggle under complex conditions like clouds. The hybrid method effectively reconstructs LST under cloudy-sky conditions with good accuracy. This shows that combining the two can improve not only retrieval accuracy but also spatial coverage.

Abstract

Land surface temperature (LST) is a key variable in the physics of land surface processes on both regional and global scales. This paper addresses the retrieval of all-sky (clear-sky and cloudy-sky) LSTs from data acquired by the Medium-Resolution Spectral Imager II (MERSI-II) on the Fengyun-3D (FY-3D) satellite. First, an improved split-window algorithm for retrieving clear-sky LSTs is developed using numerical radiative transfer modeling experiments. Then, clear-sky LSTs are retrieved from MERSI-II/FY-3D data in January and July 2022 over an Asian area (70°E~130°E, 10°N~50°N) and cross-validated against the MODIS/Aqua LST/emissivity (LST/E) Daily version 6 (MYD11C1 V6) product. Next, a hybrid method combining the eXtreme Gradient Boosting (XGBoost) model and surface energy balance theory is developed to estimate cloudy-sky LSTs. After that, cloudy-sky LSTs are estimated from the MERSI-II data and validated with the China Meteorological Administration Land Data Assimilation System Version 2 (CLDAS V2) dataset. Against the MYD11C1 LSTs, the root mean square error (RMSE), bias and coefficient of determination (R²) of the retrieved clear-sky LSTs are 1.15 K, 0.01 ± 1.14 K, and 0.99, respectively. Against the CLDAS LSTs, the RMSE, bias and R² of the estimated hypothetical clear-sky LSTs are 4.05 K, 0.75 ± 3.98 K and 0.91, respectively, while those of the retrieved cloudy-sky LSTs are 3.69 K, 0.36 ± 3.67 K, and 0.92, which indicates that the cloud radiation effect correction improves the retrieval accuracy of cloudy-sky LSTs. The all-sky LSTs retrieved in this study are accurate and consistent with the results of previous studies.

Keywords:

MERSI-II/FY-3D data; all-sky land surface temperature; split-window algorithm; XGBoost model; cloud radiation effect correction

1. Introduction

Land surface temperature (LST) is a key variable in the physics of land surface processes on both regional and global scales [1]. It can be used to monitor surface thermal anomalies, urban heat islands, and thermal pollution [2], and is also closely related to ecological variables such as soil moisture, crop drought stress, and evapotranspiration [3].

At present, LSTs can be obtained through various approaches, including in situ measurements, mobile measurements, craft-based sensing, and satellite remote sensing [4]. Although the first three approaches can avoid cloud contamination and measure LST accurately [5], they are often costly and limited in spatiotemporal coverage, making them unsuitable for large-scale studies. Satellite remote sensing has become an important approach for retrieving LST at global and regional scales because of its wide spatial coverage, high sampling density, repeated observations, and relatively low cost [6]. Among the satellite techniques developed for LST retrieval, microwave remote sensing and thermal infrared remote sensing are the most widely used. Satellite microwave remote sensing can be applied to LST retrieval in a variety of weather conditions [7,8], but its spatial resolution and retrieval accuracy are generally limited. Satellite thermal infrared remote sensing provides higher LST retrieval accuracy and finer spatial resolution, making it suitable for analysis at finer scales. However, satellite thermal infrared observations are highly sensitive to clouds and are therefore limited to clear-sky conditions [9,10]. The absence of LST under cloudy skies reduces the spatiotemporal continuity of LST products, so retrieving cloudy-sky LSTs in addition to clear-sky LSTs has become increasingly important.

For clear-sky LST retrieval from thermal infrared satellite data, scientists have developed several algorithms: the single-channel algorithm [11,12], the split-window algorithm [13,14], and the multi-channel algorithm [15,16]. Compared to the single-channel and the multi-channel algorithms, the split-window algorithm is much more widely used for clear-sky LST retrieval due to its ease of implementation and high accuracy [13,17,18,19,20,21,22].

Compared to clear-sky LST retrieval, cloudy-sky LST retrieval from satellite thermal infrared data is a challenge. Nevertheless, many methods have been proposed to reconstruct LST beneath clouds, and they are roughly grouped into four categories: the spatiotemporal interpolation method [23,24], the data-fusion method [25,26], the surface energy balance (SEB)-based method [27,28], and the machine-learning method. The spatiotemporal interpolation method performs temporal reconstruction, followed by spatial reconstruction, exploiting nearby spatiotemporal samples and their statistical relationships to fill LST gaps for cloudy regions [23,24,29]. This method is simple and direct, but its accuracy is usually very low. The data fusion method makes full use of the complementary advantages of thermal infrared data and passive microwave data: thermal infrared data provide high spatial resolution information of LSTs under clear-sky conditions, while passive microwave data contain low spatial resolution information of LSTs beneath clouds [25,26]. The significant differences in physical mechanisms between thermal infrared and microwave remote sensing make it impossible to produce robust and accurate cloudy-sky LSTs, especially for complex environments. Based on the assumption that the difference between the cloudy-sky LST and the nearby clear-sky LST is mainly due to the attenuation of incoming solar short-wave radiation by clouds, the SEB-based method estimates cloudy-sky LSTs by combining the surface energy balance equation and environmental variables such as solar radiation, albedo, wind speed, and so on [27,28]. Although the SEB-based method is relatively complex and its retrieval accuracy depends heavily on input environmental variables, its clear physical mechanisms and interpretability make it more suitable for cloudy-sky LST retrieval in complex environments, and recent research reported that good results were obtained by the SEB-based method [28]. 
In recent years, the machine-learning method has gained popularity due to its high accuracy and computational efficiency, and has also provided a new way to describe the complex relationship between the cloudy-sky LST and radiation factors [30,31]. Among the machine-learning algorithms, the eXtreme Gradient Boosting (XGBoost) method achieves high predictive accuracy by iteratively optimizing residuals, and it is also computationally efficient and highly scalable due to optimizations such as parallel processing, column sampling, and regularization [32]. It is therefore well suited for modeling the nonlinear relationship between cloudy-sky LST and satellite measurements. Currently, hybrid algorithms that combine machine learning and physical modeling show great potential for cloudy-sky LST retrieval.

Fengyun-3D (FY-3D) is the fourth satellite of the FY-3 series. It carries eleven payloads, among which the Medium-Resolution Spectral Imager II (MERSI-II) is an improved version of the MERSI aboard FY-3A/B/C [33]. Table 1 summarizes the channel specifications of MERSI-II/FY-3D, including spectral, radiometric, and geometric characteristics. MERSI-II/FY-3D has 25 channels covering the visible, near-infrared, mid-infrared, and thermal infrared spectra. Channels 20–23 have a spatial resolution of 1000 m, whereas the split-window channels have a spatial resolution of 250 m, which makes MERSI-II the world’s first imaging instrument that can obtain global thermal infrared measurements in the split-window channels at 250 m. For channels 20–25, the brightness temperature ranges between 180 K and 350 K, and the noise equivalent temperature difference (NEΔT) is 0.25–0.40 K at 270 K. The improved MERSI-II instrument enhances the capability to observe surface and atmospheric variables in greater detail. Given its channel configuration, MERSI-II/FY-3D data are well suited for LST retrieval.

To retrieve accurate all-sky (clear-sky and cloudy-sky) LSTs from MERSI-II/FY-3D data, a hybrid algorithm is proposed in this paper, in which clear-sky LSTs are retrieved using the generalized split-window algorithm, while cloudy-sky LSTs are estimated using a method that integrates the XGBoost algorithm and SEB theory. In the following, Section 2 describes the study area, datasets, and data processing; Section 3 presents the proposed method; Section 4 gives the results and analysis of all-sky LST retrieval and accuracy validation; and Section 5 discusses the limitations of the proposed method and future work. The final section is devoted to the summary and conclusions.

2. Study Area, Data and Data Processing

2.1. Study Area and Time Periods

In this study, an Asian area ranging from 70°E to 130°E and from 10°N to 50°N is selected as the study area, which covers most of China as well as surrounding regions in East, South and Southeast Asia (Figure 1). The study area has diverse land cover types, including trees, dense short vegetation, cropland, open waters, desert, semi-arid areas, etc., and it also encompasses various terrains: plateaus, mountains, and plains. The diversity of terrain and land cover types of the study area facilitates the retrieval and analysis of all-sky LSTs.

Two typical time periods, January 2022 and July 2022, are selected in this study, which are representative of winter and summer conditions, respectively.

According to MERSI-II/FY-3D cloud mask product and the China Meteorological Administration Land Data Assimilation System Version 2 (CLDAS V2) data, which will be introduced in Section 2.2, the percentages of clear-sky and cloudy-sky grids of land surfaces and the mean LSTs across three latitude ranges, [10.0°N, 25.0°N), [25.0°N, 40.0°N) and [40.0°N, 50.0°N], are calculated for the two months. In this study, the clear-sky grids include those composed entirely of absolutely clear-sky observations, as well as those containing both absolutely clear-sky and probably clear-sky observations, provided that the proportion of absolutely clear-sky observations exceeds 50%. Otherwise, the grids are classified as cloudy-sky grids. The statistical results are listed in Table 2. In January 2022, the proportion of clear-sky grids ranges from 31.4% to 43.9%, with the mid-latitude region [25.0°N, 40.0°N) exhibiting the highest cloud coverage, while the mean LST varies from 258.59 K to 293.57 K and reaches its maximum in the low-latitude region [10.0°N, 25.0°N). In July 2022, cloud coverage further increases, particularly in the low-latitude region [10.0°N, 25.0°N), where the cloudy-sky grids account for 96.5% of the total, whereas the mean LST ranges from 294.30 K to 300.39 K, with the lowest value occurring in the mid-latitude region [25.0°N, 40.0°N). The results indicate that cloud contamination is widespread across the study area, especially in July, highlighting the importance of developing reliable cloudy-sky LST retrieval methods. The pronounced spatial and monthly variations in cloud cover and LST between January and July 2022 also demonstrate the representativeness of the selected study area and study periods.

2.2. Data Description and Processing

To retrieve all-sky LSTs, the following data are used: the MERSI-II/FY-3D data (https://satellite.nsmc.org.cn, accessed on 10 March 2025), the MODIS/Aqua data (https://search.earthdata.nasa.gov, accessed on 25 March 2025), the fifth generation of the European Centre for Medium-Range Weather Forecasts (ECMWF) atmospheric reanalysis (ERA5) data (https://cds.climate.copernicus.eu, accessed on 5 April 2025), the Global LAnd Surface Satellite (GLASS) data (http://www.glass.umd.edu, accessed on 30 April 2025), the Global 30 Arc-Second Elevation (GTOPO30) data (https://www.usgs.gov, accessed on 15 April 2025), and the CLDAS V2 data (https://data.cma.cn, accessed on 10 June 2025).

The MERSI-II/FY-3D data include MERSI-II/FY-3D L1 data and MERSI-II/FY-3D cloud mask data. The MERSI-II/FY-3D L1 data provide the observations at top-of-atmosphere (TOA) in the channels 1~25 (Table 1), geolocation, observation time, viewing zenith angle, solar zenith angle, etc. The MERSI-II/FY-3D cloud mask data are generated from the MERSI-II/FY-3D L1 data using a multi-feature thresholding method, and provide the cloud mask flag (Cloud_Mask), cloud mask quality assurance (Cloud_Mask_QA), and cirrus mask (Cirrus_Mask) for each observation, which are used to determine whether the observations are collected under clear-sky conditions or not. All MERSI-II/FY-3D data are resampled to the 0.05° geographic Climate Modeling Grid (CMG) using the average method.

The MODIS/Aqua data used in this study include MODIS/Aqua LST/emissivity (LST/E) daily (MYD11C1) version 6 (V6) product, MODIS/Aqua LST/E 8-day (MYD11C2) V6 product, and the MODIS/Aqua Vegetation Indices Monthly L3 (MYD13C2) V6 product, which are projected on the 0.05-degree geographic CMG. The MODIS/Aqua data products are selected because both Aqua and FY-3D are afternoon satellites, which helps reduce the phase differences and systematic errors induced by diurnal variation. Previous validation studies against in situ measurements collected at thermally homogeneous sites have demonstrated that the MODIS LST product achieves root mean square errors (RMSE) of approximately 1.3 K and 2.0 K under daytime and nighttime conditions, respectively [34]. Given its well-established accuracy and extensive validation, the MYD11C1 V6 LST product is used as the reference dataset for evaluating the clear-sky LSTs retrieved from MERSI-II/FY-3D data, while the MYD11C2 V6 LSEs are utilized to estimate the LSEs in MERSI-II/FY-3D channels 24 and 25 using the Baseline Fit (BF) method [35,36]. The MYD13C2 V6 product provides monthly vegetation indices, in which the Normalized Difference Vegetation Index (NDVI) serves as one of the input features for hypothetical clear-sky LST estimation.

The ERA5 data provide hourly estimates of a large number of atmospheric, land and oceanic climate variables with longitude and latitude resolutions of 0.25° × 0.25°, and resolve the atmosphere using 37 levels from the surface up to a height of 80 km [37]. The candidate ERA5 variables used in this study include the surface downward shortwave radiation (SDSR), surface downward longwave radiation (SDLR), surface upward longwave radiation (SULR), total precipitable water (TPW), 2 m temperature (T_a), surface pressure (P_s), soil moisture (SM), and wind speed (WS). The ERA5 variables are primarily used to calculate the differences in radiative fluxes between clear-sky and cloudy-sky conditions, thereby estimating the influence of clouds on LST [38]. They are also used to estimate hypothetical clear-sky LSTs. In addition, the ERA5 TPW is involved in the clear-sky LST retrieval. The ERA5 data are resampled to the 0.05° geographic Climate Modeling Grid (CMG) using the bilinear interpolation method.

The GLASS data provide a series of high-quality land surface biophysical parameters at a spatial resolution of 0.05° in both longitude and latitude, and they have been widely used in the remote sensing community in recent years [39]. To correct the cloud radiation effect, the blue-sky albedo (α), Broad Band Emissivity (BBE) and Leaf Area Index (LAI) are used in this study, in which the blue-sky albedo is a key input to estimate the hypothetical clear-sky LSTs, while the BBE and LAI are used for cloud effect correction.

The GTOPO30 data, released by the United States Geological Survey (USGS), provide Earth surface elevation with a horizontal grid spacing of 30 arc-seconds (approximately 1 km). To reflect the negative influence of elevation on LSTs, the GTOPO30 surface elevation is used as one of the input features for hypothetical clear-sky LST estimation in this study. The GTOPO30 surface elevations are resampled to the 0.05° geographic CMG using the average method.

The CLDAS V2 product is generated from in situ observations, multi-source satellite retrievals, and numerical model analysis using variational and other data assimilation techniques, and it provides all-sky LSTs in the Asia region (approximately 0–65°N and 60°E–160°E) with temporal and spatial resolutions of 1 h and 0.0625°, respectively. The CLDAS V2 LST product was evaluated using quality-controlled observations from the operational automatic LST station network across China. The evaluation results showed good agreement between the CLDAS LSTs and ground measurements, with a nationwide average correlation coefficient of 0.98, a RMSE of 1.8 K, and a bias of 1.4 K (https://data.cma.cn/data/cdcdetail/dataCode/NAFP_CLDAS2.0_NRT.html, accessed on 20 October 2025). Although relatively larger errors were observed in the permafrost region of Qinghai–Tibet Plateau [40], the CLDAS V2 LST product is still a reliable reference dataset for validation of the retrieved cloudy-sky LSTs. The CLDAS LSTs are resampled to the 0.05° geographic CMG using the average method.

3. Methods

In this study, the clear-sky LSTs are retrieved using the improved split-window algorithm [14], while the cloudy-sky LSTs are estimated using a hybrid method that integrates the eXtreme Gradient Boosting (XGBoost) version 3.1.2 model and the SEB theory.

3.1. Development of the Split-Window Algorithm

The improved split-window algorithm is given by [14]:

T_{s} = C + (A_{1} + A_{2} \frac{1 - ε}{ε} + A_{3} \frac{Δ ε}{ε^{2}}) \frac{T_{i} + T_{j}}{2} + (B_{1} + B_{2} \frac{1 - ε}{ε} + B_{3} \frac{Δ ε}{ε^{2}}) \frac{T_{i} - T_{j}}{2} + D {(T_{i} - T_{j})}^{2}

(1)

with ε = (ε_{i} + ε_{j}) / 2 and Δε = ε_{i} - ε_{j}, where T_s is LST, and T_i and T_j are the brightness temperatures at top-of-atmosphere (TOA) in the channel i centered at ~11.0 μm and in the channel j centered at ~12.0 μm, respectively; ε_i and ε_j denote the LSEs in the channels i and j, respectively; and C, A₁, A₂, A₃, B₁, B₂, B₃, and D are unknown coefficients. In this study, i and j correspond to MERSI-II/FY-3D channels 24 and 25, respectively.
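As an illustration, Equation (1) can be evaluated per pixel as in the following minimal Python sketch. The function name and the placeholder coefficients are assumptions introduced here for illustration; they are not the fitted coefficients of this study.

```python
def split_window_lst(t_i, t_j, eps_i, eps_j, coeffs):
    """Evaluate Equation (1) for one observation.

    t_i, t_j     : TOA brightness temperatures (K) in the ~11 um and ~12 um channels
    eps_i, eps_j : land surface emissivities in the same two channels
    coeffs       : tuple (C, A1, A2, A3, B1, B2, B3, D)
    """
    c, a1, a2, a3, b1, b2, b3, d = coeffs
    eps = 0.5 * (eps_i + eps_j)          # mean emissivity
    d_eps = eps_i - eps_j                # emissivity difference
    x1 = (1.0 - eps) / eps
    x2 = d_eps / eps ** 2
    mean_t = 0.5 * (t_i + t_j)
    half_diff = 0.5 * (t_i - t_j)
    return (c
            + (a1 + a2 * x1 + a3 * x2) * mean_t
            + (b1 + b2 * x1 + b3 * x2) * half_diff
            + d * (t_i - t_j) ** 2)


# Placeholder coefficients (illustrative only, NOT the fitted values):
# with A1 = 1 and all others 0, the result is the mean brightness temperature.
demo_coeffs = (0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
demo_lst = split_window_lst(290.0, 288.0, 0.98, 0.97, demo_coeffs)
```

In practice the coefficient tuple would be looked up from the sub-range tables described below.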

To determine the eight unknown coefficients in Equation (1), a large number of samples of brightness temperatures, LST, and LSEs with spatiotemporal representativeness are required. However, such samples are difficult to obtain in practice, and they are therefore usually constructed using numerical radiative transfer modeling experiments [18,19,20,21]. A previous study showed that simulations in the split-window channels agree well with satellite measurements, with mean differences generally within ±0.3 K [36]. In this study, the numerical experiments are conducted using the moderate spectral resolution atmospheric transmittance algorithm and computer model (MODTRAN) [41] fed with the SeeBor V5 dataset [42]. First, the clear-sky atmospheric profiles over land surfaces are extracted from the SeeBor V5 dataset. Then, for each profile, seven viewing zenith angles (VZAs; 0.0°, 27.5°, 40.0°, 47.5°, 53.75°, 60.0°, and 65.0°) are set, and the atmospheric transmittance, the atmospheric upwelling radiance, and the atmospheric downwelling radiance are calculated. Next, LST and LSE are set for each profile. The LST goes from T₀ − 5 K to T₀ + 20 K with a step of 5 K, where T₀ is the near-surface air temperature, while the LSE mean (ε) changes from 0.90 to 1.00 with a step of 0.02, and the LSE difference (Δε) varies from −0.025 to 0.015 with a step of 0.005 [18,19,20,21]. After that, the brightness temperatures at TOA in MERSI-II/FY-3D channels 24 and 25 are calculated according to the radiative transfer equation. In this way, a sample dataset relating LST to the brightness temperatures, LSEs, TPW, and VZA is constructed, in which the TPWs are also extracted from the SeeBor V5 dataset. Finally, the eight unknown coefficients in Equation (1) are determined using multivariable linear regression.
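Because Equation (1) is linear in its eight coefficients, the regression step can be written as an ordinary least-squares problem. The sketch below, using NumPy, is one possible way to set it up; the function names and the synthetic samples (generated from arbitrary "true" coefficients rather than from MODTRAN simulations) are assumptions for illustration only.

```python
import numpy as np


def design_matrix(t_i, t_j, eps_i, eps_j):
    """Build the eight regressors of Equation (1), one row per sample."""
    eps = 0.5 * (eps_i + eps_j)
    d_eps = eps_i - eps_j
    x1 = (1.0 - eps) / eps
    x2 = d_eps / eps ** 2
    mean_t = 0.5 * (t_i + t_j)
    half_diff = 0.5 * (t_i - t_j)
    # Column order matches (C, A1, A2, A3, B1, B2, B3, D)
    return np.column_stack([
        np.ones_like(mean_t), mean_t, x1 * mean_t, x2 * mean_t,
        half_diff, x1 * half_diff, x2 * half_diff, (t_i - t_j) ** 2,
    ])


def fit_coefficients(t_i, t_j, eps_i, eps_j, lst):
    """Least-squares estimate of (C, A1, A2, A3, B1, B2, B3, D)."""
    x = design_matrix(t_i, t_j, eps_i, eps_j)
    coeffs, *_ = np.linalg.lstsq(x, lst, rcond=None)
    return coeffs


# Synthetic, noise-free samples standing in for the simulated dataset.
rng = np.random.default_rng(0)
n = 2000
t_i = rng.uniform(250.0, 320.0, n)
t_j = t_i - rng.uniform(0.0, 4.0, n)
eps_mean = rng.uniform(0.90, 1.00, n)
eps_diff = rng.uniform(-0.025, 0.015, n)
eps_i = eps_mean + 0.5 * eps_diff
eps_j = eps_mean - 0.5 * eps_diff
true_coeffs = np.array([-0.5, 1.0, 0.2, -0.3, 2.0, 5.0, -10.0, 0.3])  # arbitrary
lst = design_matrix(t_i, t_j, eps_i, eps_j) @ true_coeffs

fitted = fit_coefficients(t_i, t_j, eps_i, eps_j, lst)
```

With noise-free samples the fit recovers the generating coefficients; with simulated brightness temperatures the residuals of this fit correspond to the fitting RMSE discussed in the text.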

To reduce the fitting errors, the simulation samples are grouped into several sub-ranges in terms of TPW and ε, and the eight coefficients in Equation (1) are obtained for each sub-range using linear regression. The TPW is divided into six sub-ranges with an overlap of 1.0 cm: [0, 2.0], [1.0, 3.0], [2.0, 4.0], [3.0, 5.0], [4.0, 6.0], and [5.0, 7.0] cm; the LSE mean (ε) is divided into two sub-ranges with an overlap of 0.02: [0.90, 0.96] and [0.94, 1.00]. The overlaps are introduced to account for uncertainties in the input variables. During LST retrieval, the centers of the overlap intervals are used as the splitting points for determining the sub-range to which each input variable belongs. For example, when the TPW is 2 cm, the coefficients corresponding to the 1–3 cm interval are adopted for the retrieval. Figure 2 shows the fitting RMSEs varying with VZA in the sub-ranges of TPW and ε. The RMSE ranges between 0.40 K and 2.72 K and is most sensitive to VZA, followed by TPW and ε. For VZA less than 30° or TPW less than 2.0 cm, the RMSEs are less than 1.0 K.
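The sub-range selection rule can be sketched as follows. The splitting points are computed as the centers of the overlaps between consecutive sub-ranges, as stated above; the function names and the handling of values exactly on a splitting point (assigned to the higher sub-range) are assumptions made for this illustration.

```python
from bisect import bisect_right

# Sub-ranges as listed in the text
TPW_SUBRANGES_CM = [(0.0, 2.0), (1.0, 3.0), (2.0, 4.0),
                    (3.0, 5.0), (4.0, 6.0), (5.0, 7.0)]
EPS_SUBRANGES = [(0.90, 0.96), (0.94, 1.00)]


def split_points(subranges):
    """Centers of the overlaps between consecutive sub-ranges."""
    return [0.5 * (nxt[0] + cur[1]) for cur, nxt in zip(subranges, subranges[1:])]


def subrange_index(value, subranges):
    """Index of the sub-range whose coefficients are used for `value`."""
    return bisect_right(split_points(subranges), value)


tpw_idx = subrange_index(2.0, TPW_SUBRANGES_CM)   # TPW of 2 cm
eps_idx = subrange_index(0.97, EPS_SUBRANGES)     # mean emissivity of 0.97
```

For a TPW of 2 cm the selected sub-range is [1.0, 3.0] cm, which matches the example given in the text.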

With the coefficients of the TPW and ε sub-ranges, the brightness temperatures, the LSEs, the VZA and TPW, an initial LST can be obtained. To further improve the fitting accuracy, LST is divided into four groups, namely ≤282.5 K, [277.5, 297.5] K, [292.5, 312.5] K, and ≥307.5 K, and the unknown coefficients in Equation (1) are determined for each sub-range of LST, TPW and ε. Figure 3 displays the fitting RMSEs varying with VZA in the sub-ranges of LST, TPW and ε. The RMSE ranges between 0.36 K and 2.72 K, and increases with LST, TPW and ε. Although the minimum and maximum RMSEs are not significantly reduced, much smaller RMSEs are obtained for the sub-ranges with low LST, TPW and VZA. For the sub-ranges with LST less than 282.5 K and VZA less than 60°, and for those with VZA less than 25° and TPW less than 2.0 cm, the RMSEs are less than 1.0 K. In general, grouping by LST, TPW and ε reduces the overall RMSEs.

Once the coefficients are determined, the clear-sky LST retrieval is performed in two steps: First, an initial LST is estimated from the MERSI-II data using the split-window algorithm with coefficients of the sub-ranges of TPW and ε. This initial LST estimate is then used to determine the appropriate LST sub-range. In the second step, the final LST is retrieved using the split-window algorithm with coefficients corresponding to the selected sub-ranges of LST, TPW, and LSE, in which the TPW is extracted from the ERA5 data, while the LSEs are calculated from the MYD11C2 data using the Baseline Fit (BF) method [35]. This two-step strategy enables the use of LST-dependent coefficients while avoiding the need for prior knowledge of the actual LST.
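The two-step strategy can be sketched as below. This is a minimal illustration under stated assumptions: the LST splitting points (280, 295 and 310 K) are taken as the centers of the overlaps between the four LST groups, by analogy with the rule given for TPW and ε; `retrieve` is a hypothetical callable standing in for Equation (1) applied to one observation; and the coefficient sets are toy placeholders.

```python
from bisect import bisect_right

# Assumed splitting points (K): centers of the overlaps between the LST groups
# <=282.5, [277.5, 297.5], [292.5, 312.5], >=307.5
LST_SPLITS_K = [280.0, 295.0, 310.0]


def lst_group(lst_k):
    """Index (0 to 3) of the LST group that a temperature falls into."""
    return bisect_right(LST_SPLITS_K, lst_k)


def two_step_retrieval(retrieve, coarse_coeffs, fine_coeffs_by_group):
    """First pass without, second pass with, LST-dependent coefficients.

    retrieve             : hypothetical callable mapping a coefficient set to an LST (K)
    coarse_coeffs        : coefficients of the selected (TPW, emissivity) sub-range
    fine_coeffs_by_group : one coefficient set per LST group for that sub-range
    """
    initial_lst = retrieve(coarse_coeffs)      # step 1: initial estimate
    group = lst_group(initial_lst)             # choose the LST group from it
    return retrieve(fine_coeffs_by_group[group]), group


# Toy stand-in for Equation (1): each "coefficient set" is just an offset in K.
def toy_retrieve(offset):
    return 296.0 + offset


final_lst, group = two_step_retrieval(toy_retrieve, 0.0, [0.4, 0.3, 0.2, 0.1])
```

The initial estimate only needs to be accurate enough to land in the correct LST group, and the overlaps between groups make the choice tolerant to small errors in that estimate.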

3.2. Development of Cloudy-Sky LST Estimation Method

The cloudy-sky LST can be expressed as the sum of hypothetical clear-sky LST (T_{s,hypothetical}) and the temperature increment induced by cloud radiation effect (ΔT) [43], i.e.,

T_{s, cloudy} = T_{s,hypothetical} + Δ T

(2)

3.2.1. Hypothetical Clear-Sky LST Estimation Method

LST is a nonlinear function of radiation energy input, land-surface biophysical characteristics, and atmospheric states [44], and thus, it is difficult to describe their relation using a simple analytical expression, especially under cloudy-sky conditions. In this study, a hypothetical clear-sky LST estimation model employing an XGBoost model is developed, in which the Bayesian optimization algorithm is applied to automatic hyperparameter searching.

To train the XGBoost model, a sample dataset with spatiotemporal representativeness is needed. In this study, the clear-sky LSTs retrieved from the MERSI-II data using the improved split-window algorithm are used as model labels. The selection of input features has a strong impact on the model’s predicting performance. To adequately characterize the key factors controlling LST, the model input features are selected from four aspects: radiation energy balance, atmospheric state, surface characteristics, and other spatiotemporal variables. For the radiation energy balance, the following candidate variables are considered: the clear-sky ERA5 surface downward shortwave radiation (SDSR^clr, where the “clr” denotes clear-sky), the ERA5 surface downward longwave radiation (SDLR^clr), and the blue-sky albedo (α) extracted from the GLASS data, which jointly characterize the input and output energy of land surfaces under clear-sky conditions, i.e., they directly control LST. The candidate atmospheric-state features are 2 m temperature (T_a), TPW, surface pressure (P_s), and wind speed (WS), which describe the near-surface thermal and moisture background. The candidate surface-characteristic variables include the Normalized Difference Vegetation Index (NDVI) and soil moisture (SM), which capture the influences of vegetation and surface wetness on energy partitioning and evaporative cooling. The other candidate spatiotemporal variables are the local time (t), longitude (Lon), latitude (Lat), and GTOPO30 Earth surface elevation (h).

To collect the samples, two criteria are applied: collocation in the 0.05° geographic CMG space and an absolute time difference of less than 30 min. These two criteria alone yield too many samples, which may have a negative impact on the model. To reduce the size of the dataset while maintaining its spatiotemporal representativeness, twelve days with the largest portion of clear-sky regions are selected from the early, middle and late parts of January 2022 and July 2022, and a total of 11,322,766 samples are collected. The sample dataset is randomly divided into a training set, a hyperparameter-optimization set, and an independent test set with a ratio of 6:1:3. The hyperparameter-optimization set is used for automatic tuning in the Bayesian optimization stage, in which the posterior distribution is iteratively updated within a predefined hyperparameter search space and the next evaluation point is selected. Once the optimal hyperparameters are obtained, the XGBoost model is fitted using the training set, and then its predictive performance is evaluated using the test set.
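A random 6:1:3 partition of sample indices can be sketched as follows. The function name, the fixed seed and the rounding rule (any remainder goes to the test set) are assumptions for illustration; the source does not specify how the split was implemented.

```python
import random


def split_indices(n_samples, ratios=(6, 1, 3), seed=0):
    """Randomly partition sample indices into train / tuning / test subsets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)       # reproducible shuffle
    total = sum(ratios)
    n_train = n_samples * ratios[0] // total
    n_tune = n_samples * ratios[1] // total
    train = indices[:n_train]
    tune = indices[n_train:n_train + n_tune]
    test = indices[n_train + n_tune:]          # remainder goes to the test set
    return train, tune, test


train_idx, tune_idx, test_idx = split_indices(1000)
```

Keeping the three subsets disjoint ensures that the test set remains independent of both model fitting and hyperparameter tuning.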

The feature selection is conducted in terms of the normalized gain importance metric from XGBoost. Figure 4 demonstrates the importance of all input features. For predicting hypothetical clear-sky LSTs, T_a, SDSR^clr, SDLR^clr, and TPW have the largest impacts in sequence, whereas WS has the weakest impact. Thus, the WS is removed from the candidate prediction features.

The multicollinearity among the candidate prediction features is evaluated using the Variance Inflation Factor (VIF), which is given by [45]

VIF_{i} = \frac{1}{1 - R_{i}^{2}}

(3)

where R_{i}^{2} is obtained by regressing the ith candidate prediction feature, x_i, against all other input features.
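Equation (3) can be computed directly with ordinary least squares, as in the following NumPy sketch. The function name and the synthetic features (an elevation column, a pressure column derived from it with an assumed 8000 m scale height plus noise, and an independent column) are assumptions used only to show how strongly related features produce large VIF values.

```python
import numpy as np


def variance_inflation_factors(features):
    """VIF of each column, from regressing it on all other columns plus an intercept."""
    x = np.asarray(features, dtype=float)
    n, p = x.shape
    vifs = np.empty(p)
    for i in range(p):
        target = x[:, i]
        others = np.column_stack([np.ones(n), np.delete(x, i, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        residual = target - others @ coef
        r2 = 1.0 - residual.var() / target.var()   # coefficient of determination
        vifs[i] = 1.0 / (1.0 - r2)                 # Equation (3)
    return vifs


# Synthetic example: pressure is nearly determined by elevation.
rng = np.random.default_rng(1)
n = 5000
elevation_m = rng.uniform(0.0, 5000.0, n)
pressure_hpa = 1013.0 * np.exp(-elevation_m / 8000.0) + rng.normal(0.0, 5.0, n)
independent = rng.uniform(0.0, 1.0, n)
vifs = variance_inflation_factors(
    np.column_stack([pressure_hpa, elevation_m, independent]))
```

In this toy case the pressure and elevation columns receive large VIF values while the independent column stays close to 1.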

Table 3 presents the VIF values of the candidate prediction features. Most features exhibit low to moderate multicollinearity, with VIF values below 10. The highest VIF values are observed for surface pressure (P_s, 47.80) and surface elevation (h, 47.78), indicating strong collinearity between these two features, which is physically consistent with the close relationship between atmospheric pressure and elevation. Surface elevation primarily influences cloudy-sky LST retrieval indirectly by shaping the overall atmospheric environment and surface heterogeneity, while surface pressure directly controls atmospheric optical depth and radiative transfer behavior, making it more critical for the physically based correction schemes. Experimental results show that incorporating both surface pressure and surface elevation improves model performance. Therefore, both features are selected as input predictors in the model. Overall, the results suggest that multicollinearity among the remaining features is not severe and is unlikely to significantly affect model performance. In addition, the XGBoost model is generally less sensitive to multicollinearity than linear regression because it is tree-based. The candidate prediction features provide complementary information for hypothetical clear-sky LST estimation.

Therefore, the following features are finally selected as the inputs of the XGBoost model: T_a, SDSR^clr, SDLR^clr, TPW, t, NDVI, α, Lon, Lat, P_s, h, and SM; the hypothetical clear-sky LST estimation model is expressed as

T_{s,hypothetical} = f (T_{a}, SDSR^{clr}, SDLR^{clr}, TPW, t, NDVI, α, Lon, Lat, P_{s}, h, SM)

(4)

where f(·) is the XGBoost model with optimal hyperparameter configuration obtained by Bayesian optimization.

With the selected input features, the XGBoost model is retrained using the training set and then tested using the test set. As shown in Figure 5, the hypothetical clear-sky LSTs estimated using the XGBoost model agree well with the clear-sky LSTs retrieved from the MERSI-II/FY-3D data using the improved split-window algorithm: most of the scatters are distributed around the 1:1 diagonal, and the bias, the RMSE and the coefficient of determination (R²) are 0.0 K, 1.3 K and 0.99, respectively. The testing results reveal that the XGBoost model developed in this study can accurately estimate the hypothetical clear-sky LSTs without obvious systematic bias.
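The three statistics reported here can be computed as in the sketch below. It assumes that the bias is the mean difference (estimate minus reference), that the value following "±" in the reported biases is the standard deviation of the differences, and that R² is computed as one minus the ratio of residual to total sum of squares; the source does not state these definitions explicitly, so they are labeled assumptions.

```python
import math


def validation_metrics(estimated, reference):
    """Return (bias, std of differences, RMSE, R^2) for paired samples."""
    diffs = [e - r for e, r in zip(estimated, reference)]
    n = len(diffs)
    bias = sum(diffs) / n
    std = math.sqrt(sum((d - bias) ** 2 for d in diffs) / n)
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    ref_mean = sum(reference) / n
    ss_tot = sum((r - ref_mean) ** 2 for r in reference)
    r2 = 1.0 - sum(d * d for d in diffs) / ss_tot   # assumed R^2 definition
    return bias, std, rmse, r2


# Tiny made-up example (values are illustrative, not from the study)
bias, std, rmse, r2 = validation_metrics([1.0, 2.0, 3.0, 5.0], [1.0, 2.0, 3.0, 4.0])
```

Under these definitions RMSE² = bias² + std², which is consistent with the values reported in the abstract, for example 0.75² + 3.98² ≈ 4.05².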

3.2.2. Cloud Radiation Effect Correction Method

The temperature increment induced by cloud radiation effect (ΔT) can be estimated using the surface energy balance theory. The surface energy balance equation is expressed by [46]

R_{net} = S_{net} + L_{net} = G + S_{hle}

(5)

where R_net is the net surface radiation; S_net and L_net are the net shortwave radiation and net longwave radiation, respectively; G is the ground heat flux, and S_hle is the sum of sensible and latent heat fluxes.

According to the force-restore model [47], the ground heat flux is a function of the difference between surface temperature and subsurface soil temperature:

G = \frac{k_{g}}{Δ z} (T_{s} - T_{d})

(6)

where k_g is the soil thermal conductivity, T_d is the subsurface soil temperature, and Δz is the subsurface depth, which is approximately 0.1 m [47].

Because T_d is much less sensitive to solar irradiation than LST, the partial derivative of G with respect to LST can be approximated by [46]

\frac{\partial G}{\partial T_{s}} = \frac{k_{g}}{Δ z}

(7)

Therefore, the temperature increment is expressed by:

Δ T = \frac{Δ z}{k_{g}} Δ G

(8)

The soil thermal conductivity k_g is estimated using nearby daytime and nighttime clear-sky observations: for each location, the nearest daytime and nighttime clear-sky LST are matched, and the corresponding differences in ground heat flux ΔG and LST are calculated [48], yielding

k_{g} = median (\frac{Δ z \cdot (G^{day} - G^{night})}{T_{s}^{day} - T_{s}^{night}})

(9)

where the variables with superscript “day” or “night” stand for daytime or nighttime quantities; median(·) denotes the median over all available matching locations in that month, thereby obtaining a representative k_g for the month.
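A minimal sketch of the monthly k_g estimate in Equation (9) follows. The function name, the `min_dts` guard against near-zero day-night temperature differences, and the sample values are hypothetical additions for illustration; the paper does not describe how such cases are screened.

```python
import numpy as np

def monthly_soil_conductivity(g_day, g_night, ts_day, ts_night,
                              dz=0.1, min_dts=1.0):
    """Median of dz * (G_day - G_night) / (Ts_day - Ts_night), Equation (9)."""
    g_day, g_night, ts_day, ts_night = (
        np.asarray(a, dtype=float) for a in (g_day, g_night, ts_day, ts_night)
    )
    d_g = g_day - g_night        # ground heat flux difference, W m^-2
    d_ts = ts_day - ts_night     # LST difference, K
    # Hypothetical guard: skip non-finite pairs and tiny temperature
    # differences that would make the ratio unstable.
    valid = np.isfinite(d_g) & np.isfinite(d_ts) & (np.abs(d_ts) >= min_dts)
    return float(np.median(dz * d_g[valid] / d_ts[valid]))

# Hypothetical matched day/night samples for one month.
k_g = monthly_soil_conductivity(
    g_day=[120.0, 100.0, 90.0], g_night=[-30.0, -20.0, -30.0],
    ts_day=[300.0, 305.0, 310.0], ts_night=[285.0, 290.0, 298.0],
)
print(k_g)
```

Using the median rather than the mean keeps the monthly value robust to outlying day-night pairs.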

The ground heat flux increment (ΔG) is proportional to the net radiation increment (ΔR_net) [49], i.e.,

Δ G = β \cdot Δ R_{net}

(10)

where β is the fraction of net radiation absorbed by the soil, which is an empirical function of leaf area index (LAI), i.e.,

β = 0.5 \cdot \exp (- 2.13 [0.88 - 0.78 \cdot \exp (- 0.6 \cdot LAI)])

(11)
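Equation (11) can be evaluated directly, as in the short sketch below; the function name is hypothetical. It shows that β decreases as LAI increases, i.e., denser canopies pass a smaller fraction of the net radiation to the soil.

```python
import math

def soil_absorption_fraction(lai):
    """Fraction of net radiation absorbed by the soil, Equation (11)."""
    return 0.5 * math.exp(-2.13 * (0.88 - 0.78 * math.exp(-0.6 * lai)))

# Bare soil (LAI = 0) versus progressively denser vegetation.
for lai in (0.0, 1.0, 3.0, 6.0):
    print(f"LAI = {lai:.1f} -> beta = {soil_absorption_fraction(lai):.3f}")
```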

The net radiation increment (ΔR_net) is expressed by

Δ R_{net} = (1 - α) \cdot Δ SDSR + ε_{BB} \cdot Δ SDLR - Δ SULR

(12)

where α is the blue-sky albedo, ε_BB is the broad band emissivity, and ΔSULR is the surface upward longwave radiation difference, which is expressed as follows in terms of the Stefan–Boltzmann law

Δ SULR = σ \cdot ε_{BB} \cdot [{(T_{s,hypothetical} + Δ T)}^{4} - T_{s,hypothetical}^{4}]

(13)

where σ is the Stefan–Boltzmann constant.

Following the method proposed by Liu et al. [43] for solving ΔT, Equations (10), (12) and (13) are substituted into Equation (8), which is then rearranged into the following general form:

a Δ T^{4} + b Δ T^{3} + c Δ T^{2} + d Δ T + e = 0

with

a = Δ z \cdot β \cdot σ \cdot ε_{BB} / k_{g},

b = 4 a T_{s,hypothetical},

c = 6 a T_{s,hypothetical}^{2},

d = 4 a T_{s,hypothetical}^{3} + 1, and

e = - a (Δ S_{net} + ε_{BB} \cdot Δ SDLR) / (σ \cdot ε_{BB}),

where ΔS_net = (1 − α)·ΔSDSR is the net shortwave radiation increment.

By solving the above equation, ΔT is calculated. Finally, the cloudy-sky LST is obtained using Equation (2).
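The quartic in ΔT can be solved numerically, for example with `numpy.roots`, as sketched below. The coefficients a to e are those given above. Selecting the real root of smallest magnitude is an assumption made here for illustration (the other real root is physically implausible, of the order of −2·T_{s,hypothetical}); the function name and all input values are likewise hypothetical.

```python
import numpy as np

SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4

def cloud_temperature_increment(t_hyp, d_snet, d_sdlr, k_g, beta, eps_bb,
                                dz=0.1):
    """Solve a*dT^4 + b*dT^3 + c*dT^2 + d*dT + e = 0 for dT (K).

    d_snet is the net shortwave increment (1 - albedo) * dSDSR and
    d_sdlr the downward longwave increment, both in W m^-2.
    """
    a = dz * beta * SIGMA * eps_bb / k_g
    coeffs = [
        a,
        4.0 * a * t_hyp,
        6.0 * a * t_hyp ** 2,
        4.0 * a * t_hyp ** 3 + 1.0,
        -a * (d_snet + eps_bb * d_sdlr) / (SIGMA * eps_bb),
    ]
    roots = np.roots(coeffs)
    real = roots[np.abs(roots.imag) < 1e-8].real
    # Assumption: the physically meaningful increment is the real root
    # closest to zero.
    return float(real[np.argmin(np.abs(real))])

# Hypothetical daytime case: clouds cut net shortwave by 200 W m^-2 and
# add 30 W m^-2 of downward longwave radiation.
dt = cloud_temperature_increment(t_hyp=300.0, d_snet=-200.0, d_sdlr=30.0,
                                 k_g=1.0, beta=0.3, eps_bb=0.97)
print(f"dT = {dt:.2f} K")
```

For these inputs the increment is negative (cooling of a few kelvin), as expected when the shortwave loss dominates the longwave gain.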

3.2.3. Process of All-Sky LST Retrieval

Figure 6 displays the flowchart of all-sky LST retrieval. It consists of three parts: the clear-sky LST retrieval, the cloudy-sky LST estimation, and the validation. For the clear-sky LST retrieval, the improved split-window algorithm is first developed through numerical radiative transfer modeling experiments, which are conducted using MODTRAN fed with the SeeBor V5 database, and then the clear-sky LSTs are derived from the clear-sky data. For the cloudy-sky LST retrieval, the XGBoost model is first constructed and trained; then, the hypothetical clear-sky LSTs are estimated from the cloud-contaminated data; next, the temperature increments (ΔT) induced by the cloud radiation effect are calculated in terms of the surface energy balance theory; and finally, the cloudy-sky LSTs are obtained by adding the temperature increments to the hypothetical clear-sky LSTs. For the validation, the clear-sky LSTs are cross-validated against the MYD11C1 LSTs, while the cloudy-sky LSTs are validated with the CLDAS LSTs. The clear-sky LSTs and the cloudy-sky LSTs together constitute the all-sky LSTs.
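The final compositing step of the flowchart can be sketched as follows, assuming gridded arrays on a common grid and a boolean cloud mask; the function and variable names and the sample values are hypothetical.

```python
import numpy as np

def compose_all_sky(lst_clear, lst_hypothetical, delta_t, cloud_mask):
    """Merge clear-sky and cloudy-sky LSTs into one all-sky field (K)."""
    # Cloudy-sky LST = hypothetical clear-sky LST + cloud-induced increment.
    lst_cloudy = np.asarray(lst_hypothetical) + np.asarray(delta_t)
    # Use the cloudy-sky estimate where the mask flags cloud, and the
    # split-window retrieval elsewhere.
    return np.where(cloud_mask, lst_cloudy, lst_clear)

# Hypothetical 1-D example: NaN marks pixels with no clear-sky retrieval.
lst_clear = np.array([290.0, np.nan, 295.0, np.nan])
lst_hyp = np.array([291.0, 288.0, 296.0, 300.0])
delta_t = np.array([0.0, -5.0, 0.0, -3.5])
cloud_mask = np.isnan(lst_clear)

all_sky = compose_all_sky(lst_clear, lst_hyp, delta_t, cloud_mask)
print(all_sky)
```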

4. Results and Analysis

4.1. All-Sky LST Retrieval Results

All-sky LSTs over the study area are retrieved from the MERSI-II/FY-3D data in January 2022 and July 2022 using the algorithm developed in this study. To demonstrate the results, two dates, 4 January 2022 (a winter day) and 1 July 2022 (a summer day), are selected due to moderate cloud coverage. Figure 7 shows the daytime clear-sky LSTs retrieved from the MERSI-II/FY-3D data on the two dates using the split-window algorithm. It should be noted that the results in Figure 7 are a composite of multi-orbit LSTs. Due to cloud contamination, the LSTs over most of the land surfaces are missing. On 4 January 2022, the LSTs are mainly distributed in [250, 300] K, and the LSTs in the low-latitude region are usually greater than those in the high-latitude region. On 1 July 2022 (the summer day), the LSTs primarily range between 280 K and 330 K, which are higher than those on 4 January 2022, and an opposite variation of LSTs with latitude is observed.

Figure 8 displays the cloudy-sky LSTs together with the clear-sky LSTs; these are also multi-orbit composite results. The results show that the cloud-contaminated land areas are fully filled with the cloudy-sky LSTs, and no obvious discontinuity is observed in the transition areas between the clear-sky LSTs and the cloudy-sky LSTs. The seasonal and spatial distribution patterns of the all-sky LSTs are much clearer than those of the clear-sky LSTs. On 4 January 2022 (the winter day), the LSTs are generally low and exhibit a clear decreasing trend from south to north, modulated by topographic relief and surface heterogeneity: the relatively high LSTs are mainly distributed in low latitudes and arid regions, whereas the relatively low LSTs are primarily concentrated in high-latitude and high-elevation regions, such as the Tibetan Plateau and its surroundings, and moderate LSTs are mainly distributed over the eastern monsoon region and major plains and basins. On 1 July 2022 (the summer day), the LSTs are significantly higher than those on the winter day. The relatively high LSTs are mainly distributed over arid and semi-arid regions, such as the Tarim Basin–Hexi Corridor region, western Inner Mongolia, and the southern margin of the Mongolian Plateau, while the relatively low LSTs are concentrated over the high-elevation areas, such as the Tibetan Plateau. LSTs are generally lower over the eastern humid and semi-humid areas, where vegetation is usually denser. In addition, the contrasts between plains and basins and between coastal and inland areas are also significant, reflecting regional thermal differences shaped by vegetation, moisture, and topography.

4.2. Validation of All-Sky LSTs

4.2.1. Cross-Validation of the Clear-Sky LSTs

The clear-sky LSTs retrieved from MERSI-II/FY-3D data using the generalized split-window algorithm are cross-validated with the LSTs extracted from the MYD11C1 V6 product. It should be noted that the MYD11C1 V6 product is a multi-orbit composite, and the LSTs in the overlapping areas of two neighboring orbits are not suitable for validation [19,20,21]. Therefore, the MYD11C1 LSTs in the overlapping areas are excluded from the cross-validation. To collect matching samples for cross-validation of the clear-sky LSTs in this study against the MYD11C1 LSTs, the following criteria are applied: (1) collocation in the 0.05° × 0.05° grid space, (2) an absolute observation time difference of less than 20 min, and (3) |cosθ₁/cosθ₂−1| < 0.25, where θ₁ and θ₂ are the VZAs of MERSI-II/FY-3D and MODIS/Aqua observations, respectively. According to these criteria, a total of 212,714 clear-sky matching samples are collected. Figure 9 shows the scatterplot of the matching samples and the histogram of differences between the clear-sky LSTs in this study and the MYD11C1 LSTs. The LST varies from 230.0 K to 345.0 K, and the matching samples are well distributed around the 1:1 diagonal with a coefficient of determination (R²) of 0.99. The LST differences mainly lie within ±2.5 K and approximately follow a normal distribution centered at about 0.0 K. Against the MYD11C1 LSTs, the RMSE and bias (mean of the differences ± standard deviation) are, respectively, 1.15 K and 0.01 ± 1.14 K, which fall within the accuracy range reported by previous studies [18,19,20,21]. Therefore, the clear-sky LSTs in this study agree well with the MYD11C1 LSTs and can be used for subsequent cloudy-sky LST retrieval.
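The temporal and angular matching criteria (2) and (3) can be expressed as a boolean mask, as in the sketch below. It assumes the two products have already been collocated on the same 0.05° × 0.05° grid (criterion 1); the function name, argument names and sample values are hypothetical.

```python
import numpy as np

def matching_mask(time_diff_min, vza1_deg, vza2_deg,
                  max_time_diff=20.0, max_cos_dev=0.25):
    """Flag collocated pixel pairs that satisfy criteria (2) and (3)."""
    time_ok = np.abs(np.asarray(time_diff_min, dtype=float)) < max_time_diff
    cos_ratio = (np.cos(np.radians(vza1_deg))
                 / np.cos(np.radians(vza2_deg)))
    angle_ok = np.abs(cos_ratio - 1.0) < max_cos_dev
    return time_ok & angle_ok

# Hypothetical pairs: observation time difference (min) and the VZAs
# (degrees) of the MERSI-II and MODIS observations.
mask = matching_mask(time_diff_min=[5.0, 25.0, -10.0],
                     vza1_deg=[10.0, 10.0, 60.0],
                     vza2_deg=[20.0, 20.0, 10.0])
print(mask)
```

In this example the second pair fails the time criterion and the third fails the view-angle criterion, so only the first pair is kept.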

4.2.2. Validation of the Cloudy-Sky LSTs

The cloudy-sky LSTs in this study are validated with the CLDAS LSTs. In addition to the cloudy-sky condition, the first two criteria used for the clear-sky matching samples (collocation in the 0.05° × 0.05° grid space and an absolute observation time difference of less than 20 min) are applied to collect matching samples between the cloudy-sky LSTs in this study and the CLDAS LSTs. According to these criteria, a total of 24,896 cloudy-sky matching samples are obtained.

To demonstrate the effect of the cloud radiation effect correction, the results before and after the correction are validated against the CLDAS LSTs. Figure 10 shows the validation results of the hypothetical clear-sky LSTs in this study against the CLDAS LSTs. In contrast to the clear-sky results in Figure 9, the distribution of the matching samples in Figure 10a is more dispersed, and the coefficient of determination decreases to 0.91; meanwhile, the LST difference has a larger dynamic range, varying from −12.0 K to 14.0 K. Against the CLDAS LSTs, the RMSE and bias of the hypothetical clear-sky LSTs in this study are 4.05 K and 0.75 ± 3.98 K, respectively. Figure 11 displays the validation results of the cloudy-sky LSTs after the cloud radiation effect correction in this study against the CLDAS LSTs. After the correction, the distribution of the cloudy-sky matching samples is more concentrated, the coefficient of determination increases by 0.01, and the range of LST differences narrows slightly; meanwhile, the central frequency of the difference histogram increases from 19.5% to 22%. Against the CLDAS LSTs, the RMSE and bias of the cloudy-sky LSTs after the cloud radiation effect correction in this study are 3.69 K and 0.36 ± 3.67 K, respectively; that is, both are reduced by about 0.4 K. The magnitude of the reduction achieved through the cloud effect correction is comparable to that reported in [49].
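The reported RMSE and bias statistics can be cross-checked against one another. As a worked check, and assuming the quoted standard deviation is the population standard deviation of the LST differences (an assumption; the text does not state which estimator is used), the identity RMSE² = bias² + σ² reproduces the reported RMSEs to within rounding:

```latex
\begin{align}
\mathrm{RMSE} &= \sqrt{\mathrm{bias}^{2} + \sigma^{2}} \\
\text{before correction:}\quad \sqrt{0.75^{2} + 3.98^{2}} &= \sqrt{16.4029} \approx 4.05~\mathrm{K} \\
\text{after correction:}\quad \sqrt{0.36^{2} + 3.67^{2}} &= \sqrt{13.5985} \approx 3.69~\mathrm{K}
\end{align}
```

Because the standard deviation dominates both sums, most of the RMSE reduction comes from the narrower spread of the differences rather than from the smaller mean bias.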

Compared with recent cloudy-sky retrieval studies, the proposed method achieved a competitive level of accuracy. Jia et al. (2021) reported RMSEs of approximately 3–4 K for cloudy-sky LST retrieval from MODIS and VIIRS observations using the SEB-based method [49]. Xu et al. (2022) reconstructed daytime all-sky LST over the Tibetan Plateau and achieved RMSEs ranging from 3.31 K to 4.06 K under cloudy-sky conditions [50]. Liu et al. (2023) obtained RMSEs of 3.71 K and 2.73 K for daytime and nighttime cloudy-sky LST retrievals, respectively, and an overall all-sky LST RMSE of 2.84 K from FY-4A/AGRI observations [43]. In comparison, the cloudy-sky LSTs retrieved in this study yield an RMSE of 3.69 K and an R² of 0.92 against the CLDAS LST product, which falls within the accuracy range reported by these state-of-the-art methods.

5. Discussion

The use of SEB correction is a key strength of the hybrid method. By incorporating physical constraints on radiative fluxes, both the bias and RMSE of retrieved LSTs were reduced, while the coefficient of determination (R²) increased. This demonstrates that combining machine learning with physically based corrections can effectively mitigate some of the limitations of purely data-driven models. In particular, the SEB theory helps account for cloud radiation effects that are difficult for statistical models to generalize, especially under varying atmospheric and surface conditions.

It should be noted that an improvement of only about 0.4 K was achieved and the error under cloudy-sky conditions remains significantly larger than that under clear skies. This highlights several limitations. First, cloud properties (e.g., optical thickness, phase, vertical structure) introduce strong and variable perturbations that are difficult to model accurately, and the impact of precipitation on the cloudy-sky LST retrieval cannot be ignored. Second, the training data for XGBoost (likely derived from clear-sky observations or reanalysis products) may not fully represent the diversity of cloudy scenarios, leading to reduced generalization. Third, the validation against the CLDAS LST, while useful for large-scale assessment, introduces its own uncertainties because CLDAS is itself an assimilated product rather than a direct ground truth.

Another point worth discussing is the trade-off between model complexity and interpretability. The split-window algorithm is relatively transparent and physically interpretable, while the XGBoost model operates more as a “black box.” The hybrid design partially addresses this by anchoring the machine-learning output with physical correction, but further work could explore explainability techniques to better understand feature importance and model behavior under different cloud regimes.

Looking ahead, several avenues could improve the approach. Incorporating additional inputs, such as cloud optical properties from other sensors, microwave observations that penetrate clouds, or higher-resolution atmospheric profiles, could enhance the robustness of the cloudy-sky LST retrieval. Expanding validation with in situ measurements across different land cover types and climate zones, which will be conducted in the near future, would also strengthen confidence in the results. Finally, testing the transferability of the method across regions, seasons, and years would help assess its operational potential.

Overall, the study demonstrated a clear progression: high-accuracy LST retrieval under clear-sky conditions using a refined physical algorithm, and a promising hybrid strategy for extending LST estimation into cloudy-sky conditions. The results underscore the value of integrating machine learning with physical principles, while also highlighting the persistent challenges associated with cloud-contaminated observations.

6. Summary and Conclusions

This paper presented the all-sky LST retrieval from MERSI-II/FY-3D data in January 2022 and July 2022 over the study area ranging from 70°E to 130°E and from 10°N to 50°N. First, the improved split-window algorithm was developed using numerical radiative transfer simulation experiments. Then, the clear-sky LSTs were retrieved from MERSI-II/FY-3D data and cross-validated against the MYD11C1 LSTs. Next, the XGBoost model was developed, and the hypothetical clear-sky LSTs were estimated. After that, the cloud radiation effect correction method was developed in terms of the SEB theory, and the LST increments were obtained. Finally, the cloudy-sky LSTs were obtained by adding the temperature increment to hypothetical clear-sky LSTs, and validated with the CLDAS LSTs.

Against the MYD11C1 LSTs, the RMSE, bias and R² of the retrieved clear-sky LSTs in this study are 1.15 K, 0.01 ± 1.14 K, and 0.99, respectively. Against the CLDAS LSTs, the RMSE, bias and R² of the estimated hypothetical clear-sky LSTs are 4.05 K, 0.75 ± 3.98 K and 0.91, respectively, while those of the retrieved cloudy-sky LSTs are 3.69 K, 0.36 ± 3.67 K, and 0.92. In general, the all-sky LSTs retrieved in this study are accurate and consistent with the results of previous studies. The hybrid method developed in this study can be used to generate a global long-term all-sky LST dataset.

Author Contributions

Conceptualization, G.-M.J.; methodology, G.-M.J.; software, H.-H.Z.; validation, H.-H.Z.; formal analysis, H.-H.Z.; investigation, H.-H.Z.; resources, H.-H.Z.; data curation, H.-H.Z.; writing—original draft preparation, H.-H.Z.; writing—review and editing, G.-M.J.; visualization, H.-H.Z.; supervision, G.-M.J.; project administration, G.-M.J.; funding acquisition, G.-M.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, grant number 41871222.

Data Availability Statement

Data available on request due to privacy restrictions.

Acknowledgments

The authors would like to thank the editors and the anonymous reviewers for their constructive suggestions, which helped them to improve the quality and presentation of this article. Thanks are also given to the National Meteorological Science Data Center, Beijing, China, for providing MERSI-II/FY-3D and CLDAS V2 data.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Sellers, P.J.; Hall, F.G.; Asrar, G.; Strebel, D.E.; Murphy, R.E. The First ISLSCP Field Experiment (FIFE). Bull. Am. Meteorol. Soc. 1988, 69, 22–27.
2. Shi, H.; Xian, G.Z.; Auch, R.F.; Gallo, K.; Zhou, Q. Urban heat island and its regional impacts using remotely sensed thermal data: A review of recent developments and methodology. Land 2021, 10, 867.
3. Anderson, M.C.; Norman, J.M.; Kustas, W.P.; Houborg, R.; Starks, P.J.; Agam, N. A thermal-based remote sensing technique for routine mapping of land-surface carbon, water and energy fluxes from field to regional scales. Remote Sens. Environ. 2008, 112, 4227–4241.
4. Liu, H.; Huang, B.; Cheng, X.; Yin, M.; Shang, C.; Luo, Y.; He, B.-J. Sensing-based park cooling performance observation and assessment: A review. Build. Environ. 2023, 245, 110915.
5. Huang, L.; Ahmad, S.; Miao, C.; Chen, F.; Mohaghegh, L.; Cheshmehzangi, A. Uncovering the Air Quality Benefits of Urban Forests Using UAV Surveys. Urban Build. Sci. 2026, 2, 5.
6. Li, Z.-L.; Tang, B.-H.; Wu, H.; Ren, H.; Yan, G.; Wan, Z.; Trigo, I.F.; Sobrino, J.A. Satellite-derived land surface temperature: Current status and perspectives. Remote Sens. Environ. 2013, 131, 14–37.
7. Wang, B.; Guo, P.; Meng, C.; Wang, Q. Retrieval and verification of land surface temperature in China based on an FY-3D microwave radiation imager. Trans. Atmos. Sci. 2022, 45, 112–123.
8. Li, Z.-C.; Jiang, G.-M. Sea surface temperature retrieval from the FY-3D MWRI measurements. IEEE Trans. Geosci. Remote Sens. 2024, 62, 4201010.
9. Minnett, P.J.; Alvera-Azcárate, A.; Chin, T.M.; Corlett, G.K.; Gentemann, C.L.; Karagali, I.; Li, X.; Marsouin, A.; Marullo, S.; Maturi, E.; et al. Half a century of satellite remote sensing of sea-surface temperature. Remote Sens. Environ. 2019, 233, 111366.
10. Østby, T.I.; Schuler, T.V.; Westermann, S. Severe cloud contamination of MODIS land surface temperatures over an Arctic ice cap, Svalbard. Remote Sens. Environ. 2014, 142, 95–102.
11. Qin, Z.; Karnieli, A. A mono-window algorithm for retrieving land surface temperature from Landsat TM data and its application to the Israel-Egypt border region. Int. J. Remote Sens. 2001, 22, 3719–3746.
12. Jiménez-Muñoz, J.C.; Sobrino, J.A. A generalized single channel method for retrieving land surface temperature from remote sensing data. J. Geophys. Res. 2003, 108, 4688.
13. Wan, Z.; Dozier, J. A generalized split-window algorithm for retrieving land-surface temperature from space. IEEE Trans. Geosci. Remote Sens. 1996, 34, 892–905.
14. Wan, Z. New refinements and validation of the MODIS land-surface temperature/emissivity products. Remote Sens. Environ. 2008, 112, 59–74.
15. Sun, D.L.; Pinker, R.T. Estimation of land surface temperature from a Geostationary Operational Environmental Satellite (GOES-8). J. Geophys. Res. Atmos. 2003, 108, 4326.
16. Mao, K.; Shi, J.; Qin, Z.; Gong, P.; Xu, B.; Jiang, L. A four-channel algorithm for simultaneous retrieval of land surface temperature and emissivity from ASTER data. J. Remote Sens. 2006, 4, 593–599.
17. McMillin, L.M. Estimation of sea surface temperatures from two infrared window measurements with different absorption. J. Geophys. Res. 1975, 80, 5113–5117.
18. Jiang, G.-M.; Li, Z.-L. Split-window algorithm for land surface temperature estimation from MSG1-SEVIRI data. Int. J. Remote Sens. 2008, 29, 6067–6074.
19. Jiang, G.-M.; Zhou, W.; Liu, R. Development of split-window algorithm for land surface temperature estimation from the VIRR/FY-3A measurements. IEEE Geosci. Remote Sens. Lett. 2013, 10, 952–956.
20. Jiang, G.-M.; Liu, R. Retrieval of sea and land surface temperature from SVISSR/FY-2C/D/E measurements. IEEE Trans. Geosci. Remote Sens. 2014, 52, 6132–6140.
21. Li, S.; Jiang, G.-M. Land surface temperature retrieval from Landsat-8 data with the generalized split-window algorithm. IEEE Access 2018, 6, 18149–18162.
22. Qin, Z.; Dall’Olmo, G.; Karnieli, A.; Berliner, P. Derivation of split window algorithm and its sensitivity analysis for retrieving land surface temperature from NOAA advanced very high resolution radiometer data. J. Geophys. Res. Atmos. 2001, 106, 22655–22670.
23. Metz, M.; Rocchini, D.; Neteler, M. Surface temperatures at the continental scale: Tracking changes with remote sensing at unprecedented detail. Remote Sens. 2014, 6, 3822–3840.
24. Yang, G.; Sun, W.; Shen, H.; Meng, X.; Li, J. An integrated method for reconstructing daily MODIS land surface temperature data. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2019, 12, 1026–1040.
25. Duan, S.-B.; Li, Z.-L.; Leng, P. A framework for the retrieval of all-weather land surface temperature at a high spatial resolution from polar-orbiting thermal infrared and passive microwave data. Remote Sens. Environ. 2017, 195, 107–117.
26. Zhang, X.; Zhou, J.; Göttsche, F.-M.; Zhan, W.; Liu, S.; Cao, R. A method based on temporal component decomposition for estimating 1-km all-weather land surface temperature by merging satellite thermal infrared and passive microwave observations. IEEE Trans. Geosci. Remote Sens. 2019, 57, 4670–4691.
27. Fu, P.; Xie, Y.; Weng, Q.; Myint, S.; Meacham-Hensold, K.; Bernacchi, C. A physical model-based method for retrieving urban land surface temperatures under cloudy conditions. Remote Sens. Environ. 2019, 230, 111191.
28. Lu, L.; Venus, V.; Skidmore, A.; Wang, T.; Luo, G. Estimating land-surface temperature under clouds using MSG/SEVIRI observations. Int. J. Appl. Earth Obs. Geoinf. 2011, 13, 265–276.
29. Ding, L.; Zhou, J.; Zhang, X.; Wang, S.; Tang, W.; Wang, Z.; Ma, J.; Ai, L.; Li, M.; Wang, W. Estimation of all-weather land surface temperature with remote sensing: Progress and challenges. Natl. Remote Sens. Bull. 2023, 27, 1534–1553.
30. Fan, X.M.; Liu, H.G.; Liu, G.H.; Li, S.B. Reconstruction of MODIS land-surface temperature in a flat terrain and fragmented landscape. Int. J. Remote Sens. 2014, 35, 7857–7877.
31. Zhao, W.; Duan, S.-B. Reconstruction of daytime land surface temperatures under cloud-covered conditions using integrated MODIS/Terra land products and MSG geostationary satellite data. Remote Sens. Environ. 2020, 247, 111931.
32. Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
33. Han, X.; Wang, F.; Shan, T. Research and applications of true color image composite method for Fengyun-3D. J. Mar. Meteorol. 2019, 39, 13–23.
34. Duan, S.-B.; Li, Z.-L.; Li, H.; Göttsche, F.-M.; Wu, H.; Zhao, W.; Leng, P.; Zhang, X.; Coll, C. Validation of Collection 6 MODIS land surface temperature product using in situ measurements. Remote Sens. Environ. 2019, 225, 16–29.
35. Seemann, S.W.; Borbas, E.E.; Knuteson, R.O.; Stephenson, G.R.; Huang, H. Development of a global infrared land surface emissivity database for application to clear sky sounding retrievals from multispectral satellite radiance measurements. J. Appl. Meteorol. Climatol. 2008, 47, 108–123.
36. Jiang, G.-M.; Zou, Y.; Chen, H. Assessment and correction of the on-orbit radiometric calibration in FY-3D MERSI-2 thermal infrared channels. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5003510.
37. Hersbach, H.; Bell, B.; Berrisford, P.; Biavati, G.; Horányi, A.; Muñoz Sabater, J.; Nicolas, J.; Peubey, C.; Radu, R.; Rozum, I. ERA5 Hourly Data on Single Levels from 1959 to Present; Copernicus Climate Change Service (C3S), Climate Data Store (CDS): Reading, UK, 2018.
38. Hogan, R.J. Radiation Quantities in the ECMWF Model and MARS; ECMWF: Reading, UK, 2015.
39. Liang, S.; Cheng, J.; Jia, K.; Jiang, B.; Liu, Q.; Xiao, Z.; Yao, Y.; Yuan, W.; Zhang, X.; Zhao, X.; et al. The Global Land Surface Satellite (GLASS) product suite. Bull. Am. Meteorol. Soc. 2021, 102, E323–E337.
40. Hu, J.Y.; Zhao, L.; Wang, C.; Hu, G.-J.; Zou, D.-F.; Xing, Z.-P.; Jiao, M.-D.; Qiao, Y.-P.; Liu, G.-Y.; Du, E.J. Applicability evaluation and correction of CLDAS surface temperature products in permafrost region of Qinghai-Tibet Plateau. Clim. Change Res. 2024, 20, 10–25.
41. Berk, A.; Bernstein, L.S.; Anderson, G.P.; Acharya, P.K.; Robertson, D.C.; Chetwynd, J.H.; Adler-Golden, S.M. MODTRAN cloud and multiple scattering upgrades with application to AVIRIS. Remote Sens. Environ. 1998, 65, 367–375.
42. Borbas, E.E.; Seemann, S.W.; Huang, H.-L.; Li, J.; Menzel, W.P. Global profile training database for satellite regression retrievals with estimates of skin temperature and emissivity. In Proceedings of the XIV International ATOVS Study Conference, Beijing, China, 25–31 May 2005; pp. 763–770.
43. Liu, W.; Cheng, J.; Wang, Q. Estimating hourly all-weather land surface temperature from FY-4A/AGRI imagery using the surface energy balance theory. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5001518.
44. Zhang, H.; Tang, B.-H.; Li, Z.-L. A practical two-step framework for all-sky land surface temperature estimation. Remote Sens. Environ. 2024, 303, 113991.
45. O’Brien, R.M. A caution regarding rules of thumb for variance inflation factors. Qual. Quant. 2007, 41, 673–690.
46. Jin, M. Interpolation of surface radiative temperature measured from polar orbiting satellites to a diurnal cycle: 2. Cloudy-pixel treatment. J. Geophys. Res. Atmos. 2000, 105, 4061–4076.
47. Dickinson, R.E. The force-restore model for surface temperatures and its generalizations. J. Clim. 1988, 1, 1086–1097.
48. Jia, A.; Liang, S.; Wang, D. Generating a 2-km, all-sky, hourly land surface temperature product from Advanced Baseline Imager data. Remote Sens. Environ. 2022, 278, 113105.
49. Jia, A.; Ma, H.; Liang, S.; Wang, D. Cloudy-sky land surface temperature from VIIRS and MODIS satellite data using a surface energy balance-based method. Remote Sens. Environ. 2021, 263, 112566.
50. Xu, F.; Fan, J.; Yang, C.; Liu, J.; Zhang, X. Reconstructing all-weather daytime land surface temperature based on energy balance considering the cloud radiative effect. Atmos. Res. 2022, 279, 106397.

Figure 1. Study area for all-sky LST retrieval (generated from the Global Land Cover and Land Use Change data in 2020, https://glad.umd.edu/dataset/GLCLUC2020, accessed on 20 October 2025).

Figure 2. The fitting root mean square errors varying with view zenith angle for the sub-ranges of TPW and ε.

Figure 3. The fitting root mean square errors varying with view zenith angle for the sub-ranges of LST, TPW and ε. (a) LST ≤ 282.5 K, (b) LST ∈ [277.5, 297.5] K, (c) LST ∈ [292.5, 312.5] K, and (d) LST ≥ 307.5 K.

Figure 4. Importance analysis of the XGBoost model input features.

Figure 5. Scatterplot of the hypothetical clear-sky LSTs estimated by the XGBoost model versus the clear-sky LSTs retrieved from MERSI-II/FY-3D data.

Figure 6. Flowchart of the all-sky land surface temperature retrieval.

Figure 7. The daytime clear-sky land surface temperature retrieved from MERSI-II/FY-3D data using the split-window algorithm on (a) 4 January 2022 and (b) 1 July 2022.

Figure 8. The daytime all-sky land surface temperature retrieved from MERSI-II/FY-3D data on (a) 4 January 2022 and (b) 1 July 2022.

Figure 9. Cross-validation of the clear-sky LST retrieved from MERSI-II/FY-3D data (MERSI-II LST) against the LST extracted from the MYD11C1 product (MYD11C1 LST). (a) Scatterplot of the matching samples, and (b) histogram of the differences between the MERSI-II LSTs and MYD11C1 LSTs.

Figure 10. Validation of the hypothetical clear-sky LST before cloud radiation effect correction against the LST extracted from the CLDAS V2 product (CLDAS LST). (a) Scatterplot of the CLDAS LST and the hypothetical clear-sky LST, and (b) histogram of the differences between the hypothetical clear-sky LST and CLDAS LST.

Figure 11. Validation of the cloudy-sky LSTs after cloud radiation effect correction against the LST extracted from the CLDAS V2 product (CLDAS LST). (a) Scatterplot of the CLDAS LST versus the cloudy-sky LST retrieved from MERSI-II/FY-3D data (MERSI-II LST), and (b) histogram of the differences between the cloudy-sky MERSI-II LST and CLDAS LST.

Table 1. Channel specification of MERSI-II/FY-3D.

Channel No.	Central Wavelength (nm)	Channel Width (nm)	SNR/NEΔT ¹	Nadir FOV ² (m)	Dynamic Range
1	470	50	100	250	0–90%
2	550	50	100	250	0–90%
3	650	50	100	250	0–90%
4	865	50	100	250	0–90%
5	1380	20/30	100	1000	0–90%
6	1640	50	200	1000	0–90%
7	2130	50	100	1000	0–90%
8	412	20	300	1000	0–30%
9	443	20	300	1000	0–30%
10	490	20	300	1000	0–30%
11	555	20	500	1000	0–30%
12	670	20	500	1000	0–30%
13	709	20	500	1000	0–30%
14	746	20	500	1000	0–30%
15	865	20	500	1000	0–30%
16	905	20	200	1000	0–100%
17	936	20	100	1000	0–100%
18	940	50	200	1000	0–100%
19	1030	20	100	1000	0–100%
20	3800	180	0.25 K	1000	200~350 K
21	4050	155	0.25 K	1000	200~380 K
22	7200	500	0.30 K	1000	180~280 K
23	8550	300	0.25 K	1000	180~300 K
24	10,800	1000	0.4 K	250	180~330 K
25	12,000	1000	0.4 K	250	180~330 K

¹ SNR and NEΔT denote the signal to noise ratio and the noise equivalent temperature difference at 270 K, respectively. ² FOV stands for the field of view.

Table 2. Percentages of clear-sky and cloudy-sky grids of land surfaces and mean LSTs across the three latitude ranges in January and July 2022.

Month	Latitude Range (°N)	Clear-Sky Grids (%)	Cloudy-Sky Grids (%)	Mean LST (K)
January	[40.0, 50.0]	41.3	58.7	258.59
	[25.0, 40.0)	31.4	68.6	268.21
	[10.0, 25.0)	43.9	56.1	293.57
July	[40.0, 50.0]	38.2	61.8	296.29
	[25.0, 40.0)	25.3	74.7	294.30
	[10.0, 25.0)	3.5	96.5	300.39

Table 3. Values of variance inflation factor for quantifying multicollinearity of the candidate prediction features.

Feature	T_a	SDSR^clr	SDLR^clr	TPW	t	NDVI	α	Lon	Lat	P_s	h	SM
Value	7.13	2.45	7.66	4.17	4.40	5.06	2.09	4.60	8.54	47.80	47.78	2.17

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Zhang, H.-H.; Jiang, G.-M. Retrieval of All-Sky Land Surface Temperature from MERSI-II/FY-3D Data. Remote Sens. 2026, 18, 1954. https://doi.org/10.3390/rs18121954
