Evaluation of MODIS Land Surface Temperature Data to Estimate Near-Surface Air Temperature in Northeast China

Air temperature (Tair) near the ground surface is a fundamental descriptor of terrestrial environment conditions and one of the most widely used climatic variables in global change studies. The main objective of this study was to explore the possibility of retrieving high-resolution Tair from the Moderate Resolution Imaging Spectroradiometer (MODIS) land surface temperature (LST) products, covering complex terrain in Northeast China. The All Subsets Regression (ASR) method was adopted to select the predictors and build optimal multiple linear regression models for estimating maximum (Tmax), minimum (Tmin), and mean (Tmean) air temperatures. The relative importance of predictors in these models was evaluated via the Standardized Regression Coefficients (SRCs) method. The results indicated that the optimal models could estimate the Tmax, Tmin, and Tmean with relatively high accuracies (Model Efficiency≥ 0.90). Both LST and day length (DL) predictors were important in estimating Tmax (SRCs: daytime LST = 0.53, DL = 0.35), Tmin (SRCs: nighttime LST = 0.74, DL = 0.23), and Tmean (SRCs: nighttime LST = 0.72, DL = 0.28). Models predicting Tmin and Tmean had better performance than the one predicting Tmax. Nighttime LST was better at predicting Tmin and Tmean than daytime LST data at predicting Tmax. Land covers had noticeable influences on estimating Tair, and even seasonal vegetation greening could result in temporal variations of model performance. Air temperature could be accurately estimated using remote sensing, but the model performance was varied across different spatial and temporal scales. More predictors should be incorporated for the purpose of improving the estimation of near surface Tair from the MODIS LST production.


Introduction
Air temperature (T air ) near the ground surface is a fundamental descriptor of terrestrial environment conditions [1,2] and one of the most widely used climatic variables in global change studies.It plays an important role in multiple biological and physical processes among the hydrosphere, atmosphere, and biosphere [3][4][5].Accurate estimation of T air and mapping its spatial distribution are useful for predicting ecological consequences of climate change.For example, climate warming will lead to higher temperatures and an increase of extreme weather events, which are associated with changes in wildfire regime [6][7][8], forest biomass distribution [9] and crop yield [10,11].The demand for accurate spatial T air data over large scale has continued to rise [12,13].
Air temperature is often measured in thermometer shelters 1.5 m-2 m above the ground at meteorological stations, and is a commonly recorded form of meteorological observation data with Remote Sens. 2017, 9, 410 2 of 19 high accuracy and temporal resolution [5,14,15].However, these recorded data are limited by the sparse distribution of meteorological stations.Hence, the spatial resolution of recorded air temperature data is coarse [1,3,16].Various interpolation methods such as global interpolators, kriging, and inverse density weighting (IDW) are often carried out in order to compensate for this shortcoming [15,17].These methods have been proven to be successful in estimating temperature near meteorological stations, but it could lead to considerable uncertainties caused by inhomogeneous distribution of meteorological stations, especially in the regions with complicated topography [4,18,19].
Satellite remote sensing can provide high spatio-temporal resolution data of land surface temperature (LST) at a global coverage [16,20].It might provide a feasible way to improve the spatiotemporal accuracy of estimating T air , because it is spatially contiguous and available on a regular basis, especially in regions where ground observations are too sparse to support reliable spatial interpolation [5,21].Land surface temperature is defined as the "skin" temperature of the ground, which was detected by satellites via looking through the atmosphere to the ground [15,17].The LST is not equivalent to T air , and their relationship is complex from a theoretical and empirical perspective [22][23][24].Most studies about estimating T air using remote sensing indexes are based on simple linear regression models.For example, Tao et al. [25] showed that the correlation between nighttime LST and daily minimum T air (R 2 = 0.93) is significantly better than that between daytime LST and maximum T air (R 2 = 0.79).During daytime when there is direct solar illumination, the difference between LST and maximum T air is mainly determined by the ground surface energy balance, which is a complex system depending on many additional factors such as solar radiation, cloud-cover, soil moisture and surface roughness [15,17].Some effective measures were introduced to improve the precision of estimating daily maximum T air , for example, the temperature-vegetation index (TVX) method was used in Zhu et al. [16].This method is based on the assumption that the radiometric temperature of a fully developed vegetation canopy is in equilibrium with ambient T air [3,15].However, the performance of the TVX method mainly relies on the negative correlation between LST and the normalized difference vegetation index (NDVI), and NDVI varies greatly with different satellite sensor and land surface conditions [15,26].In other studies, Benali et al. [5] used an optimization procedure with mixed bootstrap and jackknife resample to estimate maximum air temperature (T max ), minimum air temperature (T min ) and mean air temperature (T mean ) based on LST and auxiliary data.However, previous studies often either lack spatial and/or temporal predictions, present relatively low predictive power or do not consider the model's uncertainty.They also do not take into account the relative importance of the controls, such as latitude, longitude, elevation, Julian day, NDVI and so on, which might influence the relationship between LST and T air .
Northeast China is a special geographical and important economic region.From southeast to northwest in this region, humid, semi-humid, and semi-arid zones distribute in turn with decreasing annual precipitation.The land covers in this region are diverse with natural forests, grassland, farmland, and urban area.Forests in Northeast China occupy about 30% of the total carbon storage and provide approximately 30% of the total timber production in China [27,28].Boreal larch forests in this region are also fire prone, characterized by frequent surface fires, mixed with infrequent stand-replacing crown fires, with an estimated historical mean fire return interval of 30-120 years [29,30].Regular spatial resolution of T air can be used to predict possible changes in forest wildfire regimes and plant phenology [1], and is of great use for quantifying ecosystem processes and functions such as forest NPP and crop yield [31,32].
The main objective of our study was to systematically explore the relationship between T air and Moderate Resolution Imaging Spectroradiometer (MODIS) LST over Northeast China.In order to estimate the eight-day T max , T min , and T mean , a statistical approach was implemented based on MODIS LST and auxiliary data.The specific objectives of this study were to (1) evaluate relative importance of the driving factors influencing the relationship between LST and eight-day T air , (2) develop optimal models to predict T air covering a large area that encompasses a broad variation of physiognomy and land cover, and (3) estimate the optimal models' performance in temporal (i.e., month and season) and spatial (i.e., land cover) scales.

Study Area
The study area includes Heilongjiang, Jilin, Liaoning provinces and the eastern Inner Mongolia Autonomous Region, which are located in Northeastern China (38  20 E-135 • 05 E), and encompasses about 1.47 × 10 6 km 2 .The altitude ranges from 0 m to 2691 m.Geographically, this region is characterized by plains separated by four major mountain ranges (Changbai Mountains, Small Xing'an Mountains, Great Xing'an Mountains and Yanshan Mountains), and the majority of croplands and forests are distributed in these plain and mountain regions respectively (Figure 1).The climate is controlled by the high latitude East Asia monsoon climate and includes warm temperate, intermediate, and cool temperate zones distributed from south to north along latitude, and humid, semi-humid and semiarid zones from east to west.Meteorological station data sets  show that annual average temperature varies between −4.0 • C and 11.3 • C, with a strong spatial gradient pattern.The monthly mean temperature ranges from the minimum −28.9 • C in January to the maximum 25.3 • C in July.Annual average rainfall ranges from 246.1 mm to 1095.0 mm, and the precipitation from May to September accounts for 80% of the total.The vegetation in this area is dominated by grass, conifer forest, mixed, and deciduous broadleaf forest.The growing season starts in May and the vegetation reaches its maximum primary productivity in July or August, and then begins to grow slowly thereafter.

Study Area
The study area includes Heilongjiang, Jilin, Liaoning provinces and the eastern Inner Mongolia Autonomous Region, which are located in Northeastern China (38°45′N-53°34′N, 115°20′E-135°05′E), and encompasses about 1.47 × 10 6 km 2 .The altitude ranges from 0 m to 2691 m.Geographically, this region is characterized by plains separated by four major mountain ranges (Changbai Mountains, Small Xing'an Mountains, Great Xing'an Mountains and Yanshan Mountains), and the majority of croplands and forests are distributed in these plain and mountain regions respectively (Figure 1).The climate is controlled by the high latitude East Asia monsoon climate and includes warm temperate, intermediate, and cool temperate zones distributed from south to north along latitude, and humid, semi-humid and semiarid zones from east to west.Meteorological station data sets  show that annual average temperature varies between −4.0 °C and 11.3 °C, with a strong spatial gradient pattern.The monthly mean temperature ranges from the minimum −28.9 °C in January to the maximum 25.3 °C in July.Annual average rainfall ranges from 246.1 mm to 1095.0 mm, and the precipitation from May to September accounts for 80% of the total.The vegetation in this area is dominated by grass, conifer forest, mixed, and deciduous broadleaf forest.The growing season starts in May and the vegetation reaches its maximum primary productivity in July or August, and then begins to grow slowly thereafter.

MODIS Data
The LST product from MODIS has been used in previous studies to derive Tair [5,15,16].MODIS sensors were launched on board the National Aerodynamics and Space Administration (NASA) Earth Observing System (EOS) Terra and Aqua satellites in December 1999 and May 2002,

MODIS Data
The LST product from MODIS has been used in previous studies to derive T air [5,15,16].MODIS sensors were launched on board the National Aerodynamics and Space Administration (NASA) Earth Observing System (EOS) Terra and Aqua satellites in December 1999 and May 2002, respectively [16].Both sensors are aboard sun-synchronous polar orbiting satellites.MODIS Terra data is available during 10:30-12:00 a.m. and p.m. (daytime/nighttime) local time, while MODIS Aqua sensor collects the imagery during 1:00-3:00 a.m. and p.m. (daytime/nighttime).The Aqua LST images were chosen instead of the scenes from the Terra platform for this study, because their acquisition time is closer to the actual occurrence of observed maximum and minimum temperatures [5,33,34].Two MODIS land products (version 5) were chosen in this study: Aqua eight-day LST and emissivity (MYD11A2) and reflectance of bands 1-7 (MYD09A1) covered the entire study period from July 2002 to July 2016.Here, the eight-day LST, view zenith angle, and view time from MYD11A2 and reflectance of band 1 and band 2 from MYD09A1 were used in the following analysis.These data are available from the Level-1 and Atmosphere Archive and Distribution System (LAADS) Distributed Active Archive Center (DAAC) at the Goddard Space Flight Center (GSFC) (https://ladsweb.nascom.nasa.gov/search/).The eight-day LST product (MYD11A2) at 1 km resolution is the averaged LSTs of the MYD11A1 over eight days, and LST products (MYD11A1) are retrieved from the MODIS thermal and mid-infrared spectral regions using the generalized split window algorithm [35].MYD09A1 provides MODIS bands 1-7 surface reflectance at 500 m resolution.Each pixel contains the best observation during an eight-day period selected on the basis of high observation coverage, low view zenith angle, absence of clouds or cloud shadow, and aerosol loading [36].In this paper, MYD09A1 was used to calculate eight-day NDVI.The land cover extracted from the MODIS Land Cover Type Yearly L3 Global 500 m SIN Grid (MCD12Q1, 2000, download from https://ladsweb.nascom.nasa.gov/search/)were converted to 1 km resolution using a nearest neighbor resample type to match the resolution of MODIS LST products.The land cover of each meteorological station was extracted in terms of its position, and reclassified into urban, farmland, forest, and grassland.

Vegetation Index
Normalized difference vegetation index is the most common remote sensing index used to parameterize vegetation status [16,37,38].The absorption and reflectivity of the vegetation cover are correlated with their structural properties, such as leaf area index (LAI), fractional vegetation cover (FVC), and their physiological condition [39,40].The values of NDVI vary between −1 and 1, where the range between 0.2 and 0.9 is mostly common in continuous vegetation cover [39].The NDVI is defined as: where NIR is near-infrared band, RED is red band.In MODIS product MYD09A1, band 1 is the red band and band 2 is the near-infrared band.
Applying Equation (1) to MYD09A1, eight days NDVI at 500 m resolution was produced, and then resampled to 1 km resolution.

Meteorological Data
Air temperature observations were obtained from 123 meteorological ground stations in the study area (Figure 1).The measurements included daily T min , T max , and T mean .The time period selected for our study ranged from July 2002 to July 2016 according to the availability of meteorological station record and MODIS Aqua data.The meteorological station records were obtained from the China Meteorological Data Sharing Service System (http://cdc.cma.gov.cn/home.do).This data was aggregated to eight-day periods using the same compositing method applied to the daily remote sensing LST data (i.e., by averaging the daily valid observations in the eight days period).

Auxiliary Data
In addition to MODIS products (LST, view time, view zenith angle, and the number of clear days) and NDVI, some auxiliary variables were used, including latitude, longitude, elevation, distance to the coast, Julian day, and day length (in hours) (Table 1).These auxiliary variables either have a known impact on T air and LST or influence the relationship between T air and LST.Latitude and longitude were derived from the location file of meteorological stations.Elevation was obtained from a 90-m resolution digital elevation model (DEM) (downloaded from http://www.glcf.umd.edu/data/srtm/), and then converted to be the same resolution and projection as the LST dataset on the basis of the nearest-neighbor method.This method is considered to be efficient and is widely used as a resampling method in related research [41].Distances to the coast (based on the Global Self-Consistent Hierarchical High-Resolution Shoreline (GSHHS), download from http://www.soest.hawaii.edu/pwessel/gshhs/index.html)were calculated in the ESRI ArcGIS 9.3.Both Julian day and day length are proxies for the fraction of solar energy absorption during the day and emission during the night, influencing the diurnal amplitude of T air throughout the year.Julian day is the continuous count of days from 1 January every year.The day length was calculated [42]: where DayLen is the day length in hours, δ is the solar declination in radians, Ø is the geographic latitude in radians, and ω is the angular velocity of the earth's rotation (0.2618 rad h −1 ).The value δ is defined as follows: the value d n is the day number of the year (Julian day).The low quality pixels in LST image were removed according to quality assessment (QA) layer, then outlier pixels were eliminated using the application of several histogram and temperature gradient-based outlier filters [43,44].All the MODIS data, including LST, view zenith angle, view time, clear days, and NDVI, were extracted from MYD11A2 and MYD09A1 products for the location of the meteorological stations in ESRI ArcGIS.The auxiliary data sets were extracted using the same method.All these data were combined to create a single dataset for each eight-day period and each meteorological station.The combining criteria between meteorological and satellite-retrieved data for the same date are based on a one-to-one relationship, relating eight-day T mean , T max , and T min from a station with eight-day satellite-retrieved data (daytime LST, nighttime LST, view zenith angle, view time, clear days, and NDVI) and the auxiliary data (elevation, distance to the coast, Julian day, and day length).Finally, all the records with null values were removed, and the numbers of valid records for daytime LST, nighttime LST, and LST with both day and night are 56,507, 65,164 and 54,691 respectively.The collinearity of independent variables was detected using variance inflaction factor (VIF > 10) and pair wise correlation (r > 0.7) [45,46].

Predictor Selection and Model Building
All Subsets Regression (ASR) and Standardized Regression Coefficients (SRCs) methods were adopted to select the predictors, and then used to build different multiple linear regression models.The ASR method could overcome the limitation of the stepwise regression by inspecting every possible model when all independent variables are taken into account.The best model was selected in terms of the adjusted R 2 from the possible models.The SRCs method could assess the relative importance of predictors, which describe expected change in the response variable holding the other predictors constant.We used the combination of ASR and SRCs to select the final predictors with strong interpretation and prediction power.

Model Calibration and Validation
Cross-validation was used to evaluate the generalizability of a model for predicting air temperature with the LST data.The observations of near surface air temperature were randomly divided into two parts.70% of observations were use for model calibration, and the rest were used as test dataset for model validation.A set of statistic measures were calculated to assess the accuracy of the predicted air temperature, including root mean square error (RMSE), model efficiency (MEF), coefficient of residual mass (CRM), mean bias error (MBE), and mean absolute error (MAE).The equations of the statistic measures followed as: where n is the number of records in validation datasets used in this study, ŷi is the estimated variable, and y i is the observed variable.
The RMSE is an indicator showing the bias in the mean and in the spatial variance.The value of MEF ranges from −1.0 to 1.0, and would be lower if the estimation method had a poor performance.The CRM indicates overall under-or overestimation of the estimation method, with the value approaching zero for perfect estimation.A positive value of CRM indicates the tendency to underestimate the observed values, and vice versa.The MBE shows the direction of the error bias with zero indicating unbiased estimation of the model.The MAE shows the error magnitude, and lower values mean better performance [41,47].
The uncertainty of predicted air temperature in spatial and temporal scales was also calculated.The pairs of observations and predictions were compared in each meteorological station.Model performance was compared across different land cover (static variable) and Julian day (dynamics variable).Moreover, the impact of seasonality on model accuracy was also evaluated.
MODIS data processing was implemented in MODIS Reprojection Tool (MRT, https://lpdaac.usgs.gov/tools/MODIS_reprojection_tool) and software R using the "sp", "raster" and "rgdal" packages.All the statistical analyses were performed in R using the basic function and "leap" package (R Development Core Team 2015).

Variable and Model Selection
The adjusted R 2 of T max models ranged from 0.72 to 0.90 (Figure 2a, starting at the bottom).The models with the following six independent variables were considered optimal with the adjusted R 2 fixed at 0.90.Further increasing independent variables didn't enhance the adjusted R 2 .The day LST, day length, Julian day, NDVI, the number of clear days, and latitude were all determined as important independent variables with SRCs at 0.53, 0.35, 0.11, 0.11, 0.08, and −0.05, respectively (Figure 3a).The candidate models involving these six independent variables (model 6 to model 7, Table 2) were constructed to predict T max .In model selection of T min , three independent variables could elevate the adjusted R 2 to 0.94 (Figure 2b).The night LST whose SRC was 0.74 could predict T min well.The SRCs of the other two predictors (day length and elevation) were 0.23 and 0.05, respectively (Figure 3b).Models 9, 10, and 11 were the candidate models to predict T min (Table 2).The combination of nighttime LST and day length could explain 94% of variation in T mean , and the addition of other predictors did not increase the adjusted R 2 (Figure 2c).The nighttime LST, day length, the number of clear days, and the number of clear nights were important independent variables, whose SRCs were 0.72, 0.28, 0.03, and 0.03, respectively (Figure 3c).The candidate models included models 13-15.with zero indicating unbiased estimation of the model.The MAE shows the error magnitude, and lower values mean better performance [41,47].
The uncertainty of predicted air temperature in spatial and temporal scales was also calculated.The pairs of observations and predictions were compared in each meteorological station.Model performance was compared across different land cover (static variable) and Julian day (dynamics variable).Moreover, the impact of seasonality on model accuracy was also evaluated.
MODIS data processing was implemented in MODIS Reprojection Tool (MRT, https://lpdaac.usgs.gov/tools/MODIS_reprojection_tool)and software R using the "sp", "raster" and "rgdal" packages.All the statistical analyses were performed in R using the basic function and "leap" package (R Development Core Team 2015).

Variable and Model Selection
The adjusted R 2 of Tmax models ranged from 0.72 to 0.90 (Figure 2a, starting at the bottom).The models with the following six independent variables were considered optimal with the adjusted R 2 fixed at 0.90.Further increasing independent variables didn't enhance the adjusted R 2 .The day LST, day length, Julian day, NDVI, the number of clear days, and latitude were all determined as important independent variables with SRCs at 0.53, 0.35, 0.11, 0.11, 0.08, and −0.05, respectively (Figure 3a).The candidate models involving these six independent variables (model 6 to model 7, Table 2) were constructed to predict Tmax.In model selection of Tmin, three independent variables could elevate the adjusted R 2 to 0.94 (Figure 2b).The night LST whose SRC was 0.74 could predict Tmin well.The SRCs of the other two predictors (day length and elevation) were 0.23 and 0.05, respectively (Figure 3b).Models 9, 10, and 11 were the candidate models to predict Tmin (Table 2).The combination of nighttime LST and day length could explain 94% of variation in Tmean, and the addition of other predictors did not increase the adjusted R 2 (Figure 2c).The nighttime LST, day length, the number of clear days, and the number of clear nights were important independent variables, whose SRCs were 0.72, 0.28, 0.03, and 0.03, respectively (Figure 3c).The candidate models included models 13-15.

Model Validation
Among the candidate models of Tmax, model 1 (Table 2) that included only one independent variable (daytime LST) explained 86.0% of the variability in Tmax, while the addition of other predictors in models 2 to 7 explained 3-4% more.Among these seven models, models 6 and 7 had lowest RMSE.All Tmax models' CRM and MBE were positive values, indicating models overestimated the Tmax.Models 6 and 7 included the same number of predictors (Table 2), Model 7 was better at fitting and predicting according to the set of statistic measures, and the latitude predictor in Model 7 was more effective than elevation in Model 6 (Figure 3a).2), the SRC of elevation is larger than latitude (Figure 3b).Comparing the statistic measures of Tmean models, model 15 was optimum for predicting Tmean.The RMSE of model 15 was 3.60 °C, which was smaller than other Tmean models.The other statistic measures (i.e., MBE = −0.29°C,MAE = 2.80 °C) also illustrated that model 15 was the best (Table 2).
Model 7, model 11, and model 15 which used auxiliary data (e.g., day length, Julian day, and NDVI) were better than model 1, model 8, and model 12, which merely involved daytime or nighttime LST data in predicting Tmax, Tmin, and Tmean respectively.The residuals of these optimal models approximated normality.Autocorrelation of the residuals were found not significant as the auto-correlation function (ACF) was smaller than 0.2 for all lags except at zero.The distribution of relative bias (observed Tair-predicted Tair) was closely centralized to zero and analogous to normal distribution (Figure 4).For all stations, the scatter plots between observed and predicted Tair distributed agreeably along the diagonal line, indicating the accuracy of model fitting (Figure 5).However, the effectiveness of optimum models was relatively poor under extreme low and high temperature conditions.The performance of models for Tmin and Tmean was better than that of Tmax.

Spatial and Temporal Performance
The optimal models were analyzed in each meteorological station to assess their effectiveness.The results showed that for the Tmax model about 80% of the stations had an RMSE lower than 5 °C (Figure 6a), MAE lower than 4 °C, and MEF higher than 0.85 (Figure 6b).The distribution of MEF was clearly skewed toward higher values.The stations' RMSE and MEF ranged from 3.41 °C to 5.94 °C, and 0.69 to 0.93 respectively, which were very similar to the overall results (RMSE = 4.63 °C and MFE = 0.89).The Tmax model MBE in most stations ranged from −1.5 °C to 1.5 °C, gathering around zero.These statistic measures illustrated that model 7 was able to predict Tmax accurately.The spatial performance of optimal models predicting Tmin (model 11) and Tmean (model 15) was better than model 7, because the distribution of their RMSE (Figure 6e,i) and MAE (Figure 6h,l) tended to be even, and MEF more skewed to 1 (Figure 6f,j).

Model Validation
Among the candidate models of T max , model 1 (Table 2) that included only one independent variable (daytime LST) explained 86.0% of the variability in T max , while the addition of other predictors in models 2 to 7 explained 3-4% more.Among these seven models, models 6 and 7 had lowest RMSE.All T max models' CRM and MBE were positive values, indicating models overestimated the T max .Models 6 and 7 included the same number of predictors (Table 2), Model 7 was better at fitting and predicting according to the set of statistic measures, and the latitude predictor in Model 7 was more effective than elevation in Model 6 (Figure 3a).2), the SRC of elevation is larger than latitude (Figure 3b).Comparing the statistic measures of T mean models, model 15 was optimum for predicting T mean .The RMSE of model 15 was 3.60 • C, which was smaller than other T mean models.The other statistic measures (i.e., MBE = −0.29 • C, MAE = 2.80 • C) also illustrated that model 15 was the best (Table 2).
Model 7, model 11, and model 15 which used auxiliary data (e.g., day length, Julian day, and NDVI) were better than model 1, model 8, and model 12, which merely involved daytime or nighttime LST data in predicting T max , T min , and T mean respectively.The residuals of these optimal models approximated normality.Autocorrelation of the residuals were found not significant as the auto-correlation function (ACF) was smaller than 0.2 for all lags except at zero.The distribution of relative bias (observed T air -predicted T air ) was closely centralized to zero and analogous to normal distribution (Figure 4).For all stations, the scatter plots between observed and predicted T air distributed agreeably along the diagonal line, indicating the accuracy of model fitting (Figure 5).However, the effectiveness of optimum models was relatively poor under extreme low and high temperature conditions.The performance of models for T min and T mean was better than that of T max .

Spatial and Temporal Performance
The optimal models were analyzed in each meteorological station to assess their effectiveness.The results showed that for the T max model about 80% of the stations had an RMSE lower than 5 • C (Figure 6a), MAE lower than 4 • C, and MEF higher than 0.85 (Figure 6b).The distribution of MEF was clearly skewed toward higher values.The stations' RMSE and MEF ranged from 3.41 • C to 5.94 • C, and 0.69 to 0.93 respectively, which were very similar to the overall results (RMSE = 4.63 • C and MFE = 0.89).The T max model MBE in most stations ranged from −1.5 • C to 1.5 • C, gathering around zero.These statistic measures illustrated that model 7 was able to predict T max accurately.The spatial performance of optimal models predicting T min (model 11) and T mean (model 15) was better than model 7, because the distribution of their RMSE (Figure 6e,i) and MAE (Figure 6h,l) tended to be even, and MEF more skewed to 1 (Figure 6f,j).The spatial distribution of RMSE (model 7, model 11, and model 15) showed that the models generally performed worse for the stations located in the mountain and plateau areas than those in the plains (Figure 7a,c,e).The spatial distribution of MFE was similar, which performed well even in plain areas (Figure 7b,d,f).The spatial distribution of RMSE (model 7, model 11, and model 15) showed that the models generally performed worse for the stations located in the mountain and plateau areas than those in the plains (Figure 7a,c,e).The spatial distribution of MFE was similar, which performed well even in plain areas (Figure 7b,d,f).The spatial distribution of RMSE (model 7, model 11, and model 15) showed that the models generally performed worse for the stations located in the mountain and plateau areas than those in the plains (Figure 7a,c,e).The spatial distribution of MFE was similar, which performed well even in plain areas (Figure 7b,d,f).The models performed well for stations located in urban areas and farmland, and performed poorly in forest and grassland areas (Table 3).The RMSEs of T max model 7 in urban, farmland, forest and grassland areas were The temporal distributions of model performance were evaluated at season, month, and 8-day scales.A set of statistic measures were calculated for all predicted/observed pairs within a specific season (Table 4), month (Table 5), and 8-day (Figure 8) periods, comprising all the years.Model 7 for T max , model 11 for T min , and model 15 for T mean performed well at predicting air temperature in summer according to RMSE (3.82 • C, 2.71 • C, and 2.70 • C, respectively), while they underperformed in terms of MFE (−0.04, 0.62, and 0.45, respectively).These showed that uncertainty existed in summer.Based on the synthesis of all statistic measures, all the models performed better in spring and worse in winter.The model 11 for T min and model 15 for T mean performed better than model 7 for T max , which was similar to the model performance at the spatial scale.Model performance at the monthly scale was similar to that at the seasonal scale.At the eight-day period scale, data availability was poor in winter and late autumn, during which the RMSE and MAE of model 7 for T max were high (Figure 8a).An similar pattern was also found in model 15 for T mean , and poor data availability was accompanied by higher RMSE and MAE (Figure 8c).Although the data availability used to predict T min was high in winter and late autumn, the performance of model 11 was poor with high RMSE and MAE (Figure 8b).The models performed well for stations located in urban areas and farmland, and performed poorly in forest and grassland areas (Table 3).The RMSEs of Tmax model 7 in urban, farmland, forest and grassland areas were 4.62 °C, 4.55 °C, 4.77 °C, and 5.09 °C, and MFEs were 0.89, 0.89, 0.84, and 0.88, respectively.For Tmin model 11, the RMSEs in urban, farmland, forest, and grassland areas were 3.97 °C, 3.82 °C, 4.81 °C, and 4.33 °C, respectively, and MFEs were 0.93, 0.93, 0.90, and 0.92.The RMSEs for Tmean model 15 in urban, farmland, forest, and grassland areas were 3.63 °C, 3.53 °C, 3.76 °C, and 3.77 °C, respectively, and MFEs were 0.93, 0.93, 0.90, and 0.93.Model 11 and model 15, predicting Tmin and Tmean, were better than model 7 at predicting Tmax across all land cover types.
The temporal distributions of model performance were evaluated at season, month, and 8-day scales.A set of statistic measures were calculated for all predicted/observed pairs within a specific season (Table 4), month (Table 5), and 8-day (Figure 8) periods, comprising all the years.Model 7 for Tmax, model 11 for Tmin, and model 15 for Tmean performed well at predicting air temperature in summer according to RMSE (3.82 °C, 2.71 °C, and 2.70 °C, respectively), while they underperformed in terms of MFE (−0.04, 0.62, and 0.45, respectively).These showed that uncertainty existed in summer.Based on the synthesis of all statistic measures, all the models performed better in spring and worse in winter.The model 11 for Tmin and model 15 for Tmean performed better than model 7 for Tmax, which was similar to the model performance at the spatial scale.Model performance at the monthly scale was similar to that at the seasonal scale.At the eight-day period scale, data availability was poor in winter and late autumn, during which the RMSE and MAE of model 7 for Tmax were high (Figure 8a).An similar pattern was also found in model 15 for Tmean, and poor data availability was accompanied by higher RMSE and MAE (Figure 8c).Although the data availability used to predict Tmin was high in winter and late autumn, the performance of model 11 was poor with high RMSE and MAE (Figure 8b).

Patterns of Estimated Annual Air Temperature
The annual spatial T air pattern was consistent with prior knowledge of this region solely derived from the meteorological stations data.Predicted annual T max , T min and T mean exhibited marked spatial gradients decreasing from south to north (Figure 9).The spatial gradients of predicted T air varied not only along the latitude, but also with the longitude.Estimated annual T air presented obvious patterns with higher temperature in the center of this region, and lower in both east and west sides.Meanwhile, the patterns of estimated T air were influenced by the distance to the coastline.Air temperature was relatively lower in the west of this region further away from the coastline than the east, and the lowest air temperature was found in the northwest.

Patterns of Estimated Annual Air Temperature
The annual spatial Tair pattern was consistent with prior knowledge of this region solely derived from the meteorological stations data.Predicted annual Tmax, Tmin and Tmean exhibited marked spatial gradients decreasing from south to north (Figure 9).The spatial gradients of predicted Tair varied not only along the latitude, but also with the longitude.Estimated annual Tair presented obvious patterns with higher temperature in the center of this region, and lower in both east and west sides.Meanwhile, the patterns of estimated Tair were influenced by the distance to the coastline.Air temperature was relatively lower in the west of this region further away from the coastline than the east, and the lowest air temperature was found in the northwest.

Discussion
Our study showed that it was feasible to estimate Tair by simply building a linear model between Tair and LST.However, this model performed poorly in extreme temperatures.The variation of model performance mainly depended on the spatial heterogeneity (i.e., topography, land cover) and time cycle (i.e., seasonal alteration).

Spatial Uncertainties of Models
Complex topography could affect the relationship between Tair and LST data, and give rise to the model uncertainty in predicting Tair.Cooler conditions prevail in regions with high altitude due to a steep lapse rate under the influence of cold-air drainage [48].In stands with similar altitude, south-facing slopes tend to receive greater solar radiation than shaded areas, and have higher temperature.In the meantime, cold-air pooling or temperature inversions occur frequently in mountain regions, especially under weather conditions with calm wind and clear sky, which would lead to model uncertainties in predicting Tair from LST data [49].
Physiographic factors also influenced snow accumulation and melt.Yan et al. [50] found that Tair was lower if the land was covered with snow in winter.Additionally, the linear correlation between Tair and LST was weak during snow-covered periods in high mountains [22].This is because snow cover could enhance the visible reflectance, and reduce the solar radiation absorption of ground surface.Snow cover also strongly affected vegetation greening through influences on the biotic and abiotic environment.Air temperature would be reduced if the snow cover period lengthened [49].Shifts in vegetation greening could also change soil moisture content due to snow melt, and alter the land surface temperature.It would result in uncertainty in predicting Tair through LST data irrespective of physiographic factors.Thus, the poor model performance in our study in extreme low temperature during winter could be attributed to snow cover (Figure 5).
Land cover plays an important role in influencing the relationship between Tair and LST.Our results showed that the effectiveness of optimal models in predicting Tair varied in different land

Discussion
Our study showed that it was feasible to estimate T air by simply building a linear model between T air and LST.However, this model performed poorly in extreme temperatures.The variation of model performance mainly depended on the spatial heterogeneity (i.e., topography, land cover) and time cycle (i.e., seasonal alteration).

Spatial Uncertainties of Models
Complex topography could affect the relationship between T air and LST data, and give rise to the model uncertainty in predicting T air .Cooler conditions prevail in regions with high altitude due to a steep lapse rate under the influence of cold-air drainage [48].In stands with similar altitude, south-facing slopes tend to receive greater solar radiation than shaded areas, and have higher temperature.In the meantime, cold-air pooling or temperature inversions occur frequently in mountain regions, especially under weather conditions with calm wind and clear sky, which would lead to model uncertainties in predicting T air from LST data [49].
Physiographic factors also influenced snow accumulation and melt.Yan et al. [50] found that T air was lower if the land was covered with snow in winter.Additionally, the linear correlation between T air and LST was weak during snow-covered periods in high mountains [22].This is because snow cover could enhance the visible reflectance, and reduce the solar radiation absorption of ground surface.Snow cover also strongly affected vegetation greening through influences on the biotic and abiotic environment.Air temperature would be reduced if the snow cover period lengthened [49].Shifts in vegetation greening could also change soil moisture content due to snow melt, and alter the land surface temperature.It would result in uncertainty in predicting T air through LST data irrespective of physiographic factors.Thus, the poor model performance in our study in extreme low temperature during winter could be attributed to snow cover (Figure 5).Land cover plays an important role in influencing the relationship between T air and LST.Our results showed that the effectiveness of optimal models in predicting T air varied in different land cover types (Table 3).This variation could be introduced by the specific heat capacities of different land covers.Air temperature mainly depended on the heat transfer process which was strongly influenced by the local radiation budget [22].In general, barren land has lower heat capacity than forest.Hence, air was heated much quicker over barren land than forest.Vegetation could also alter latent heat flux, such as enhancing or reducing transpiration [51,52], and cool the T air in forests [53,54].The cooling effect was not explicitly incorporated in our models due to uneven distribution of meteorological stations across different vegetation types.The meteorological station was too scarce in some vegetation types.Thus, it was hard to take consider of vegetation type in our models.However, land cover also affected land surface albedo, and a slight warming was observed in winter due to dense canopy reducing the albedo of snow cover [49].Thus, the influence of land cover on estimating T air was conditional and time dependent.

Temporal Uncertainties of Models
The temporal variation also had strong influence on the estimation of T air from LST data.Seasonal vegetation greening changed the physical properties of land surface, such as vegetation coverage, snow cover, and soil moisture.Our study area is mainly covered with lush crops and pasture, as well as frondent forests during growing season; and it would turn into bare land, leafless forests, or snow cover afterwards (Figure 1).The variation of land cover due to seasonal vegetation greening enhanced the spatial heterogeneity of LST data, and increased the temporal uncertainties in predicting air temperature [55].It is important to consider temporal variation caused by vegetation greening on the estimation of T air from LST over complex terrain [22].
In addition to vegetation greening, air masses and aerosol affected the temporal variation in the performance of T air models as well.This region is characterized by a typical temperate continental monsoon climate, and controlled by the Siberian high pressure system in winter and the Hawaiian high pressure system in summer [56,57].The basic mechanism for estimating T air with LST relies on the intense heat exchange between land surface and the atmosphere, yet the air masses strike the local energy balance due to coming from different sources [21,51].Dry and cold air masses come from Siberia in winter; conversely wet and cool air masses come from the sea surface of the Hawaiian system in summer.Cool air moving over a warmer land surface is heated discriminately from below, due to the fact that moisture content is different in such air masses.Moisture content of air is an important control on thermal conduction between air and surface [22,58].A time lag existed for establishing the new balance between air and land surface [51].Both the moisture content and time lag could result in uncertainties in estimating T air from LST data.
Aerosol varies with season, and is an important component of the air that essentially determines the energy relationship between T air and land surface [21].In this region, there are more aerosols in winter than summer due to burning fossil fuel for heating.Aerosols can absorb short-wave solar radiation to warm the atmosphere, while they usually scatter and reflect short-wave radiations to cool the ground surface.The residence time of aerosols in the atmosphere is regional and seasonal, which affects the radiative forcing of aerosols [21].Huang et al. [59] also indicated that high concentration of aerosol retained the heat by acting, to some extent, as insulators that trapped outgoing radiation.
Our results showed that the performance of models estimating T min and T mean using nighttime LST were better than that estimating T max using daytime LST (Figure 5).Nighttime LST was stable and affected by fewer factors compared with daytime LST due to the absence of solar radiation [51,59].Generally, the land surface warms faster than ambient air in the day, and cools more rapidly during the night.The warmer land surface heated the ambient air to stimulate the turbulence of ground atmosphere in the day, while the turbulence turned weak on account of lacking solar radiation to heat land surface in the night.An isothermal and homogeneous surface was established over the ground surface in the night, and improved the estimation accuracy of T air .Shen and Leptoukh [55] indicated that the relationship between T max and daytime LST depended significantly on land cover, but T min and nighttime LST had little dependence on land cover, especially during the summer time.The influences of direct solar illumination and land cover resulted in more complex interactions between ambient air and land surface.That is why the daytime LST was not as good as the nighttime LST in predicting air temperature.

Spatial Variations of Annual Air Temperature
The spatial patterns of all the predicted annual T max , T min and T mean were relatively higher in the south, and lower in the north (Figure 9).In the south, the Earth's surface could receive more solar radiation to heat the ambient air, while the energy was relatively lower in the north.In this region, the mountains covered by forests are located in the north and both sides of this region, while the cities and farm fields are located in the center (Figure 1).Air temperature becomes low where the altitude increases, and the forests are also able to cool the air temperature.Meanwhile, the artificial buildings in the plains could absorb more solar radiation to heat the ambient air.The south and east parts are closer to the coastline, so the warmer vapor coming from the sea is able to heat the ambient air through latent heat fluxes.

Conclusions
The results of this study confirm that the eight-day average of T max , T min , and T mean can be accurately estimated using LST and auxiliary data.The simple multiple linear regression models were established for T max , T min , and T mean using different predictors for each value, but LST and day length were both relatively important predictors in all models.The performance of optimal models was efficient to estimate T max (RMSE = 4.63 • C, MFE = 0.89), T min (RMSE = 3.99 • C, MFE = 0.93) and T mean (RMSE = 3.60 • C, MFE = 0.93) for all meteorological stations.The performance was better for models predicting T min and T mean than that of T max .Nighttime LST data was better in predicting T min and T mean than daytime LST data in predicting T max .Nighttime or combining daytime and nighttime LST should be taken into account to improve the performance of the model for T max .
The model performance varied across different spatial and temporal scales.The models performed well in the plains, and relatively poorly in the complex terrain of the mountains.Land covers had noticeable influences on estimating air temperature, and seasonal changes of land covers could result in temporal variations of model performance.The RMSE values for the models during the summer were slightly better than the other seasons.The changes in physical and chemical composition of near-surface atmosphere may also contribute to the models' temporal variations.More predictors should be considered for the purpose of improving the models estimating air temperature.It is possible that non-linear models or advanced ensemble models such as boosted regression trees may produce better estimation for T air .

Figure 1 .
Figure 1.The geographic location of the study area, different vegetation types, and the distribution of the meteorological stations.

Figure 1 .
Figure 1.The geographic location of the study area, different vegetation types, and the distribution of the meteorological stations.

Figure 2 .
Figure 2. Best models for each subset size based on Adjusted R-squared, (a) Tmax, (b) Tmin, and (c) Tmean variable selection.LD: 8-Day daytime 1 km grid land surface temperature, AD: Average view zenith angle of daytime land surface temperature, CD: the days in clear sky conditions and with valid daytime land surface temperature observation, LN: 8-Day nighttime 1 km grid land surface temperature, AN: Average view zenith angle of nighttime land surface temperature, CN: the days in clear sky conditions and with valid nighttime land surface temperature observation, Lat: Latitude, Lon: Longitude, Ele: Elevation, Dis: The distance to the coastal line, D: Julian day, DL: Mean day length of 8-Day day length, NDVI: Normal difference vegetation index.

Figure 2 .
Figure 2. Best models for each subset size based on Adjusted R-squared, (a) T max , (b) T min , and (c) T mean variable selection.LD: 8-Day daytime 1 km grid land surface temperature, AD: Average view zenith angle of daytime land surface temperature, CD: the days in clear sky conditions and with valid daytime land surface temperature observation, LN: 8-Day nighttime 1 km grid land surface temperature, AN: Average view zenith angle of nighttime land surface temperature, CN: the days in clear sky conditions and with valid nighttime land surface temperature observation, Lat: Latitude, Lon: Longitude, Ele: Elevation, Dis: The distance to the coastal line, D: Julian day, DL: Mean day length of 8-Day day length, NDVI: Normal difference vegetation index.
Among candidate models of Tmin, model 11 was optimum.The RMSE (3.99 °C), MAE (3.09 °C), and the confidence interval (−8.04 °C-7.84 °C) of Model 11 were smaller than others.Although Model 10 and Model 11 included the same number of variables, and the MFE, CRM and MBE were the same (Table

Figure 3 .
Figure 3. Independent variable's standardized regression coefficient for estimating air temperature, (a) T max , (b) T min , and (c) T mean .
Among candidate models of T min , model 11 was optimum.The RMSE (3.99 • C), MAE (3.09 • C), and the confidence interval (−8.04 • C-7.84 • C) of Model 11 were smaller than others.Although Model 10 and Model 11 included the same number of variables, and the MFE, CRM and MBE were the same (Table

Figure 4 .
Figure 4. (a-f) Relative bias distribution considering all eight days Tmax ((a) model 1, (b) model 7), Tmin ((c) Model 8, (d) model 11) and Tmean ((e) model 12, (f) model 15).The blue curves are normal density distribution curves, red curves are kernel density distribution curves, and green lines show the lower endpoint and upper endpoint of the relative bias 95% confidence interval.

Figure 5 .
Figure 5. Scatter plots between predicted air temperature (°C) and observed air temperature (°C) of (a) Tmax, (b) Tmin, and (c) Tmean.The blue line was the relationship between predicted air temperature and observed air temperature, the green line is 1:1 line, which represents the relationship of observed air temperature = predicted air temperature.

Figure 4 .
Figure 4. (a-f) Relative bias distribution considering all eight days T max ((a) model 1, (b) model 7), T min ((c) Model 8, (d) model 11) and T mean ((e) model 12, (f) model 15).The blue curves are normal density distribution curves, red curves are kernel density distribution curves, and green lines show the lower endpoint and upper endpoint of the relative bias 95% confidence interval.

Figure 4 .
Figure 4. (a-f) Relative bias distribution considering all eight days Tmax ((a) model 1, (b) model 7), Tmin ((c) Model 8, (d) model 11) and Tmean ((e) model 12, (f) model 15).The blue curves are normal density distribution curves, red curves are kernel density distribution curves, and green lines show the lower endpoint and upper endpoint of the relative bias 95% confidence interval.

Figure 5 .
Figure 5. Scatter plots between predicted air temperature (°C) and observed air temperature (°C) of (a) Tmax, (b) Tmin, and (c) Tmean.The blue line was the relationship between predicted air temperature and observed air temperature, the green line is 1:1 line, which represents the relationship of observed air temperature = predicted air temperature.

Figure 5 .
Figure 5. Scatter plots between predicted air temperature ( • C) and observed air temperature ( • C) of (a) T max , (b) T min , and (c) T mean .The blue line was the relationship between predicted air temperature and observed air temperature, the green line is 1:1 line, which represents the relationship of observed air temperature = predicted air temperature.

Figure 6 . 19 Figure 6 .
Figure 6.(a-l) T max , T min, and T mean model performance distribution for all the meteorological stations used: RMSE (a,e,i); MEF (b,f,j); MBE (c,g,k) and MAE (d,h,l).

Figure 8 .
Figure 8. Relations between the temporal distribution of data availability and model performance, (a) Tmax, (b) Tmin, and (c) Tmean.Data availability is the percentage (%) of the valid data used in model calibration/validation.

Figure 8 .
Figure 8. Relations between the temporal distribution of data availability and model performance, (a) T max , (b) T min , and (c) T mean .Data availability is the percentage (%) of the valid data used in model calibration/validation.

Figure 9 .
Figure 9.Estimated annual temperature of eight-day (a) T max , (b) T min, and (c) T mean integrated over the entire study period (2002-2016).

Table 1 .
Data sources and variables.

Table 2 .
Model equations and validation accuracy for T max , model accuracy is given by Adj R 2 , RMSE, MAE, MFE and R 2 change.
LD: 8-Day daytime 1 km grid land surface temperature, LN: 8-Day nighttime 1 km grid land surface temperature, NDVI: Normal difference vegetation index, DL: Mean day length of 8-Day day length, D: Julian day, Ele: Elevation, Lat: Latitude, CD: the days in clear sky conditions and with valid daytime LST, CN: the days in clear sky conditions and with valid nighttime LST.

Table 3 .
The performance of optimal models for T max , T min , and T mean for different land cover types.Model accuracy is given by RMSE, MEF, CRM, MAE, and MBE.

Table 4 .
The performance of optimal models for T max , T min , and T mean in different seasons.Model accuracy is given by RMSE, MEF, CRM, MAE, and MBE.

Table 5 .
The performance of optimal models for T max , T min , and T mean in different months.Model accuracy is given by RMSE, MEF, CRM, MAE, and MBE.