A Comparative Study of Several Popular Models for Near-Land Surface Air Temperature Estimation

Yang, Dewei; Zhong, Shaobo; Mei, Xin; Ye, Xinlan; Niu, Fei; Zhong, Weiqi

doi:10.3390/rs15041136

by Dewei Yang ¹, Shaobo Zhong ², Xin Mei ¹,*, Xinlan Ye ¹, Fei Niu ¹ and Weiqi Zhong ¹

¹ Faculty of Resources and Environmental Science, Hubei University, Wuhan 430062, China

² Institute of Urban Systems Engineering, Beijing Academy of Science and Technology, Beijing 100035, China

* Author to whom correspondence should be addressed.

Remote Sens. 2023, 15(4), 1136; https://doi.org/10.3390/rs15041136

Submission received: 18 January 2023 / Revised: 15 February 2023 / Accepted: 17 February 2023 / Published: 19 February 2023

(This article belongs to the Section Environmental Remote Sensing)

Abstract

Near-land surface air temperature (NLSAT) is an important meteorological and climatic parameter widely used in climate change, urban heat island and environmental science, in addition to being an important input parameter for various earth system simulation models. However, the spatial distribution and the limited number of ground-based meteorological stations make it difficult to obtain a large range of high-precision NLSAT values. This paper constructs neural network, long short-term memory, bi-directional long short-term memory, support vector machine, random forest, and Gaussian process regression models by combining MODIS data, DEM data, and meteorological station data to estimate the NLSAT in China’s mainland and compare them with actual NLSAT observations. The results show that there is a significant correlation between the model estimates and the actual temperature observations. Among the tested models, the random forest performed the best, followed by the support vector machine and the Gaussian process regression, then the neural network, the long short-term memory, and the bi-directional long short-term memory models. Overall, for estimates in different seasons, the best results were obtained in winter, followed by spring, autumn, and summer successively. According to different geographic areas, random forest was the best model for Northeast, Northwest, North, Southwest, and Central China, and the support vector machine was the best model for South and East China.

Keywords: air temperature estimation; neural network; support vector machine; random forest; Gaussian process regression

1. Introduction

Near-land surface air temperature (NLSAT) is the air temperature at 1.5–2 m above the ground [1]. Global temperature is currently rising. This brings serious threats such as melting glaciers, rapid evaporation, and extreme weather, which will eventually affect people’s daily lives and cause a series of losses and survival challenges [2]. Additionally, NLSAT is an input parameter for a variety of surface models and is widely used in studies on physiology, hydrology, meteorology, and the environment [3]. Therefore, accurate NLSAT is essential for understanding earth surface processes and global change processes.

Although a total of 837 meteorological stations that provide accurate NLSAT have been established in China’s mainland so far, their relatively sparse distribution and the huge cost of construction make it difficult for researchers to obtain timely temperature information on a large scale. Spatial interpolation of meteorological station observations can produce large-scale air temperature data. However, the representativeness of individual station data is reduced by the influence of topography, vegetation cover and other geographical elements [4], making it hard to obtain high-precision NLSAT over a large area.

The advent of remote sensing techniques has compensated for the difficulty of obtaining earth surface data on a large scale. Thermal infrared sensors can obtain land surface temperature to a large spatial extent and in long time series, but the NLSAT is different from land surface temperature and is highly dependent on land cover and environmental factors. Nonetheless, it is feasible to obtain a nationwide distribution of NLSAT by establishing the relationship between temperatures from satellite remote sensing and meteorological stations, which has triggered numerous discussions among scholars [5]. Chen established a linear correlation between geostationary satellite data and NLSAT in Florida, and the deviation of the results was 1.57 °C [6]. Kawashima investigated linear regression relation between Landsat satellite data and NLSAT [7]. Vancutsem explored the linear relationship between surface temperature data and actual air temperature in Africa based on MODIS data, concluding that the night data could be better used to estimate NLSAT [8]. However, in their study, only using the surface temperature as predictor was far from enough, and many other factors need to be considered. Based on this, Kawashima developed a normalized vegetation index factor [7]. Cresswell added the solar zenith angle factor to analyze the NLSAT [9]. Florio combined surface temperature, latitude and longitude, and elevation to estimate NLSAT [10]. Cristobal used the Landsat, MODIS, AVHRR and other satellite data to estimate the NLSAT by multiple linear regression [11]. These studies further reduced errors in estimates of NLSAT. As seen, NLSAT is an important parameter of surface processes, which is affected by many factors, and multi-variate analysis can effectively improve the estimation.

With the development of computer science, many computational models have emerged, bringing opportunities for NLSAT estimation. Zhao inputted vegetation index, DEM (Digital Elevation Model), and surface temperature into a neural network to train models and then used them to predict NLSAT [12]. Lins and Moser used a support vector machine (SVM) method to estimate NLSAT based on remote sensing data [13,14]. Abbot used artificial neural network to estimate historical air temperatures during the twentieth century, and the deviation of the results from the true value was small [15]. Mehrkanoon proposed a model for temperature prediction based on a deep learning framework [16]. Tao used a new hybrid model which combined the LSTM model with the random forest model, incorporating wind direction, wind speed and relative humidity to predict NLSAT. The model was found to be more accurate than BP neural network and SVM models [17]. He et al. demonstrated that the Gaussian process regression model worked best [18]. We summarized the previous studies in Table 1. Most of the existing studies examined few models and focused on a single study area, and fewer studies have been conducted for China’s mainland, where there is complex and diverse topography with obvious geographical heterogeneity in temperature.

The distribution of NLSAT affects the area and time of crop cultivation, people’s living needs, lifestyles and socio-economic development. China is a large agricultural country with a large population and the world’s second largest economic body. Therefore, NLSAT has far-reaching implications for China. The main purpose of this study is to compare the capability of several popular models in NLSAT estimation and then propose a better NLSAT estimation scheme. The findings are expected to support the obtainment of accurate NLSAT distribution data over a large area and provide a scientific basis for research into surface air temperature, climate change and agricultural production in China. First, we combine MODIS satellite remote sensing data, DEM data, and meteorological station data, and use various models such as neural network, LSTM, BiLSTM, support vector machine, random forest, and Gaussian process regression to estimate the NLSAT in China’s mainland. Then, we explore the applicability of the models for different seasons and in different geographic areas. Finally, we discuss the reasons for the differences in the results of the models and the correlation between the accuracy of NLSAT estimation and seasonal and geographical environmental factors.

2. Materials and Methods

2.1. Related Data

2.1.1. Meteorological Station Data

The meteorological station data are from the China National Meteorological Information Center (http://data.cma.cn/, accessed on 16 February 2023) and include several products such as surface weather information, surface climate information, and near-surface boundary layer observation information in China. These data products are widely used in various research fields such as drought, ecology, and climate change. Additionally, the data from meteorological stations have been quality controlled, the data availability of each element exceeds 99.9%, and the correctness rate of all data is close to 100%. In this study, we collected the monthly mean air temperature data of meteorological stations from 2007 to 2016 and eliminated the stations with missing values during this period to generate the January–December station air temperature data set. The MODIS and DEM data were then overlaid on the station data to produce the training data set for the corresponding model. Figure 1 shows the locations of the weather stations across the study area (China’s mainland).

2.1.2. Elevation Data

The DEM data are derived from SRTM (the Shuttle Radar Topography Mission) (https://srtm.csi.cgiar.org/, accessed on 16 February 2023). The data were produced through the cooperation of several official agencies and accurately reflect the terrain and surface elevation changes, covering more than 80 percent of the global surface. In this study, the DEM data were used for model training and for testing of the final prediction results. Figure 1 also shows the corresponding elevation in China.

2.1.3. MODIS Data

MODIS is a sensor on board the polar-orbiting satellites Terra and Aqua that observes the entire surface of the Earth every one to two days and is an effective data source for a wide range of surface remote sensing parameters. The data come from NASA (National Aeronautics and Space Administration, https://ladsweb.modaps.eosdis.nasa.gov/search/, accessed on 16 February 2023), and the remote sensing data products used in this study are surface temperature products (MOD11C3, MYD11C3) and vegetation index products (MOD13C2, MYD13C2).

We obtained o_day (Terra satellite daytime surface temperature) and o_night (Terra satellite nighttime surface temperature) from MOD11C3, y_day (Aqua satellite daytime surface temperature) and y_night (Aqua satellite nighttime surface temperature) from MYD11C3, o_evi (Terra satellite enhanced vegetation index) and o_ndvi (Terra satellite normalized difference vegetation index) from MOD13C2, and y_evi (Aqua satellite enhanced vegetation index) and y_ndvi (Aqua satellite normalized difference vegetation index) from MYD13C2. The above products are all monthly average data products, where the monthly surface temperature is averaged from daily data, and the monthly vegetation index is a weighted average of the 16-day data.

The NLSAT is directly influenced by the surface temperature (the weakened radiation from the surface), and similarly the air temperature is influenced by the subsurface, where the vegetation absorbs a lot of radiation from the Sun to reduce the air temperature values [4]. Yu et al. pointed out that, with the continuous greening of vegetation in China, there is a tendency for the temperature to decrease, and so the influence of vegetation should be considered when predicting temperature [19]. Figure 2 shows the correlation between air temperature and the above eight data sets. As seen, the correlation between air temperature and surface temperature is up to more than 0.9, and the corresponding relationship between air temperature and vegetation index is about 0.68. The correlation between surface temperature, vegetation index and NLSAT is obvious, and so we chose to use surface temperature data and vegetation index to predict NLSAT. Table 2 shows the basic information of the data we used.
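The correlation screening described above can be illustrated with a short sketch. It assumes the Pearson correlation coefficient, which the text does not name explicitly, and the sample values below are toy numbers for illustration only, not data from this study; the function and variable names are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Covariance and variances are left unnormalized; the factors cancel.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Toy values (assumption): station air temperature vs. a daytime LST predictor.
station_air_temp = [2.0, 8.5, 15.0, 21.5, 26.0]
modis_day_lst = [5.0, 13.0, 21.0, 29.0, 33.0]
r = pearson_r(station_air_temp, modis_day_lst)
```

Applied to each of the eight MODIS variables in turn, such a function would give one coefficient per predictor, which is the kind of summary Figure 2 reports.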

Among the above data, the meteorological station data are point data, used here as monthly averages; the spatial resolution of the elevation data is 90 m; and the resolution of the MODIS data, which are also monthly average products, is 0.5° × 0.5°. To unify the spatial resolution, we resampled these data to a final spatial resolution of 0.5° × 0.5°. Because all our data are monthly averaged products, the temporal resolution is already unified and no further processing is required.
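The text does not state which resampling method was used. As a minimal sketch, assuming simple block averaging of a fine grid onto a coarser one (a real workflow would also involve reprojection with GIS tools), the aggregation step could look as follows. The function name and the toy elevation grid are hypothetical.

```python
def block_mean(grid, factor):
    """Aggregate a 2D grid to a coarser one by averaging factor x factor blocks.

    Cells equal to None are treated as missing and skipped; a block with no
    valid cells yields None.
    """
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    coarse = []
    for r0 in range(0, rows, factor):
        coarse_row = []
        for c0 in range(0, cols, factor):
            cells = [grid[r][c]
                     for r in range(r0, r0 + factor)
                     for c in range(c0, c0 + factor)
                     if grid[r][c] is not None]
            coarse_row.append(sum(cells) / len(cells) if cells else None)
        coarse.append(coarse_row)
    return coarse

# Toy 4 x 4 elevation grid in metres (assumption), with one missing cell.
fine_dem = [
    [100.0, 120.0, 300.0, 310.0],
    [110.0, 130.0, None, 320.0],
    [500.0, 520.0, 50.0, 60.0],
    [510.0, 530.0, 70.0, 80.0],
]
coarse_dem = block_mean(fine_dem, 2)  # 2 x 2 grid of block means
```

Averaging is a natural choice for continuous variables such as elevation or temperature; other variables or target grids may call for a different method.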

2.2. Related Models

2.2.1. Neural Network Model

A neural network (NN) is a complex network system formed by many simple processing units (called neurons) that are extensively interconnected. Such a model reflects many basic features of human brain function and is a highly complex nonlinear dynamical learning system. NNs have good self-learning capability, and the technique has been widely used in many fields. The neural network model used in this study is divided into three layers: input layer, intermediate layer, and output layer (Figure 3), and the inputs and outputs in the model satisfy the following equations.

\mathrm{Net}_{a} = \sum_{a = 1}^{n} w_{a} x_{a}

(1)

y_{a} = f(\mathrm{Net}_{a})

(2)

where Net represents the weighted input sum of a single neuron, a indexes the neurons in a layer (n in total), w denotes the connection weight, x denotes the input element, y denotes the output value, and f denotes the activation function.
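Equations (1) and (2) can be sketched in a few lines of Python. This is an illustration only: the sigmoid activation, the weights, and the input values are assumptions, since the text does not specify them, and no bias term is included because Equation (1) has none.

```python
import math

def sigmoid(z):
    """Logistic activation, used here as an assumed choice for f."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(weights, inputs, activation=sigmoid):
    """Eq. (1): weighted sum of the inputs; Eq. (2): activation of that sum."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    net = sum(w * x for w, x in zip(weights, inputs))
    return activation(net)

def layer_output(weight_rows, inputs, activation=sigmoid):
    """Apply one neuron per row of weights to the same input vector."""
    return [neuron_output(row, inputs, activation) for row in weight_rows]

# Hypothetical scaled predictors and hidden-layer weights.
inputs = [0.2, -0.1, 0.7]
hidden_weights = [[0.5, -0.3, 0.8], [-0.6, 0.1, 0.4]]
hidden = layer_output(hidden_weights, inputs)
```

Stacking a second call to `layer_output` on `hidden` would give the output layer of the three-layer structure described above.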

o_day, o_night, o_evi, o_ndvi, y_day, y_night, y_evi, y_ndvi, and DEM of the station data are extracted as the inputs of the NN model, and the air temperature values of the station are used as the outputs for model learning, and finally the prediction data set is used to obtain the corresponding prediction results and make an accuracy evaluation.

2.2.2. Long Short-Term Memory Network Model

The long short-term memory network (LSTM) is a special recurrent neural network that can effectively solve the long-term dependence problem and is a common model for sequence prediction [20]. The model reduces the corresponding prediction error by filtering the corresponding information through a special gate, and its corresponding model structure is shown in Figure 4, with the following equations.

f_{t} = \sigma(W_{xf} \times X_{t} + W_{hf} \times H_{t-1} + W_{cf} \times B_{t-1} + b_{f})

(3)

i_{t} = \sigma(W_{xi} \times X_{t} + W_{hi} \times H_{t-1} + W_{ci} \times B_{t-1} + b_{i})

(4)

B_{t} = f_{t} \times B_{t-1} + i_{t} \times \tanh(W_{xc} \times X_{t} + W_{hc} \times H_{t-1} + b_{c})

(5)

o_{t} = \sigma(W_{xo} \times X_{t} + W_{ho} \times H_{t-1} + W_{co} \times B_{t} + b_{o})

(6)

H_{t} = o_{t} \times \tanh(B_{t})

(7)

where σ is the sigmoid activation function; tanh is the hyperbolic tangent activation function; f_t, i_t, and o_t are the forget, input, and output gates; W denotes the weights; X is the input data; H is the output data; B is the state information; and b is the bias.

The model is trained by feeding it the training set, followed by feeding the prediction data set into the learned model for the prediction of NLSAT.
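To make the gate equations concrete, the following sketch implements one LSTM step for scalar inputs and states, following Equations (3)–(7) with the state information written as B. The parameter dictionary, its key names, and the numeric values are hypothetical; a practical model would use vectors and matrices from a deep learning library.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x_t, h_prev, b_prev, p):
    """One scalar LSTM step; p maps weight and bias names to floats."""
    # Forget and input gates look at the previous state, Eqs. (3) and (4).
    f_t = sigmoid(p["W_xf"] * x_t + p["W_hf"] * h_prev + p["W_cf"] * b_prev + p["b_f"])
    i_t = sigmoid(p["W_xi"] * x_t + p["W_hi"] * h_prev + p["W_ci"] * b_prev + p["b_i"])
    # State update, Eq. (5).
    b_t = f_t * b_prev + i_t * math.tanh(p["W_xc"] * x_t + p["W_hc"] * h_prev + p["b_c"])
    # Output gate looks at the updated state, Eq. (6); output, Eq. (7).
    o_t = sigmoid(p["W_xo"] * x_t + p["W_ho"] * h_prev + p["W_co"] * b_t + p["b_o"])
    h_t = o_t * math.tanh(b_t)
    return h_t, b_t

def run_sequence(xs, p):
    """Feed a sequence through the cell, starting from zero output and state."""
    h, b = 0.0, 0.0
    outputs = []
    for x in xs:
        h, b = lstm_step(x, h, b, p)
        outputs.append(h)
    return outputs

PARAM_NAMES = ["W_xf", "W_hf", "W_cf", "b_f", "W_xi", "W_hi", "W_ci", "b_i",
               "W_xc", "W_hc", "b_c", "W_xo", "W_ho", "W_co", "b_o"]
params = {name: 0.1 for name in PARAM_NAMES}  # arbitrary illustrative values
outputs = run_sequence([0.5, -0.2, 0.8], params)
```

Because the output is a sigmoid gate multiplied by a tanh, every element of `outputs` lies strictly between -1 and 1, which is why targets are usually scaled before training such a model.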

2.2.3. Bi-Directional Long Short-Term Memory Network Model

The bi-directional long short-term memory model (BiLSTM) is a variant of the LSTM model. The BiLSTM is obtained by combining two interrelated LSTMs, which learn forward information and backward information, respectively. The output is obtained by jointly processing the information from these two directions. Each one-way LSTM follows the LSTM algorithm described above. Compared with the LSTM model, the BiLSTM model retains more complete information about the long-term dependence in the data. The structure of the model is shown in Figure 5.

2.2.4. Support Vector Machine

A support vector machine (SVM) is a learning model based on statistical learning theory [21,22] in which the model constructs the optimal hyperplane by mapping the input data onto a high-dimensional feature space, followed by approximation and regression in the new space. In the regression, the hyperplane is located near as many sample points as possible. In this paper, the hyperplane is constructed using the training set of weather station points with o_day, o_night, o_evi, o_ndvi, y_day, y_night, y_evi, y_ndvi, and DEM as the feature vectors, and the NLSAT is used as the prediction data, and then the prediction data set is used to validate the effect of the model.

2.2.5. Random Forest

Random forest (RF) is an ensemble learning algorithm based on decision trees, which can be used for solving both classification and regression problems [23]. The method selects input data randomly, constructs several different independent decision trees, and combines their predictions: the result with the highest number of votes is selected for classification, and the predictions are averaged for regression. Because of its strong randomness, it can effectively suppress overfitting and is widely used in a variety of regression analyses. In this study, the random forest model is constructed by inputting the corresponding site training data set to generate a specific set of decisions, and finally the corresponding prediction data set is input to the model for the prediction of results.
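The two sources of randomness, bootstrap sampling of the training data and random feature selection, can be sketched with a deliberately simplified forest. The sketch below uses depth-one regression trees (stumps) and one random feature per tree, which is a simplification of a full random forest, and the training values are toy numbers rather than data from this study; all names are hypothetical.

```python
import random

def fit_stump(rows, targets, feature):
    """Fit a depth-1 regression tree on one feature by minimizing squared error."""
    best = None
    values = sorted(set(r[feature] for r in rows))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2.0
        left = [t for r, t in zip(rows, targets) if r[feature] <= threshold]
        right = [t for r, t in zip(rows, targets) if r[feature] > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # all feature values identical: always predict the mean
        mean = sum(targets) / len(targets)
        return (feature, float("inf"), mean, mean)
    return (feature, best[1], best[2], best[3])

def fit_forest(rows, targets, n_trees=50, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and a random feature."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        feature = rng.randrange(n_features)
        forest.append(fit_stump([rows[i] for i in idx],
                                [targets[i] for i in idx], feature))
    return forest

def predict(forest, row):
    """Regression forests average the tree predictions instead of voting."""
    preds = [left if row[f] <= thr else right for f, thr, left, right in forest]
    return sum(preds) / len(preds)

# Toy samples (assumption): [surface temperature in °C, elevation in m].
rows = [[0.0, 3000.0], [5.0, 2500.0], [10.0, 2000.0], [15.0, 1500.0],
        [20.0, 1000.0], [25.0, 500.0], [30.0, 100.0]]
targets = [-5.0, 0.0, 5.0, 10.0, 15.0, 20.0, 25.0]
forest = fit_forest(rows, targets, n_trees=50, seed=0)
warm = predict(forest, [30.0, 100.0])
cold = predict(forest, [0.0, 3000.0])
```

The number of trees (`n_trees`) is the parameter that a tuning procedure would vary for this kind of model.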

2.2.6. Gaussian Process Regression

Gaussian process regression (GPR) is a nonparametric model for performing regression analysis of data using a Gaussian process prior. This is suitable for dealing with complex regression problems such as high-dimensional and nonlinear issues [24]. The model uses Bayesian methods and statistical theory to perform regression analysis on the corresponding data, and Gaussian process regression is used to predict the value of a function under arbitrary random variables by fitting the corresponding Gaussian process to a finite amount of high-dimensional data. In this study, GPR is trained for NLSAT estimation by applying the locations with station data, and then the relevant parameters without station data are input to predict the corresponding NLSAT.

2.3. Model Parameter Optimization

To ensure that differences in the NLSAT estimation results were not caused by poorly chosen model settings, we tuned the parameters of each model so that each performed as well as possible in this study. The NN model is optimized by modifying the number of layers and the number of samples per input; the LSTM and BiLSTM models are optimized by increasing or decreasing the number of layers; the SVM model is optimized by modifying the penalty coefficient (P) and the kernel function coefficient (K). The performance of the RF model is mainly affected by the number of decision trees in the forest, and so the model is tuned by adjusting the number of decision trees; the GPR model is optimized by using the Bayesian optimal tuning method.

2.4. Effectiveness Verification Indicators

To evaluate the accuracy of the models, this study uses three indicators, root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination (R²), to verify the estimation results. The smaller the RMSE and MAE and the larger the R², the better the prediction of the model. The three indicators are calculated as follows.

\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} (o_{i} - p_{i})^{2}}

(8)

\mathrm{MAE} = \frac{1}{n} \sum_{i = 1}^{n} |o_{i} - p_{i}|

(9)

R^{2} = 1 - \frac{\sum_{i = 1}^{n} (o_{i} - p_{i})^{2}}{\sum_{i = 1}^{n} (o_{i} - \bar{o})^{2}}

(10)

where o denotes the actual value of the station air temperature, p denotes the model predicted value of the air temperature, and \bar{o} denotes the mean value of the station air temperature.
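The three indicators in Equations (8)–(10) translate directly into code. The sketch below is a plain-Python illustration; the function names and the observed and predicted values are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
import math

def _check(obs, pred):
    if len(obs) != len(pred) or not obs:
        raise ValueError("obs and pred must be non-empty and of equal length")

def rmse(obs, pred):
    """Root mean square error, Eq. (8)."""
    _check(obs, pred)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mae(obs, pred):
    """Mean absolute error, Eq. (9)."""
    _check(obs, pred)
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

def r_squared(obs, pred):
    """Coefficient of determination, Eq. (10)."""
    _check(obs, pred)
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    if ss_tot == 0:
        raise ValueError("R² is undefined when all observations are equal")
    return 1.0 - ss_res / ss_tot

# Toy station observations and model estimates in °C (assumption).
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 11.0, 15.0, 15.0]
scores = (rmse(observed, predicted), mae(observed, predicted),
          r_squared(observed, predicted))
```

For these toy values every error has magnitude 1 °C, so RMSE and MAE are both 1, while R² is 1 - 4/20 = 0.8. Note that RMSE and MAE carry the unit of temperature, whereas R² is dimensionless.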

2.5. Research Methodology

Figure 6 shows the main research idea of this study. First, the NLSAT at the station is estimated by combining MODIS surface temperature data, MODIS vegetation index data, and DEM data at the weather station; second, the optimal model is selected by parameter tuning, as detailed in Section 2.3; then, the MODIS surface temperature data, MODIS vegetation index data, and DEM data are input into the optimal model. Finally, the predicted temperature values of the corresponding sites are compared with the actual values and the MODIS surface temperature.

3. Results

The monthly averages of NLSAT of meteorological stations from 2007–2016 were combined with monthly satellite remote sensing data to obtain samples for the corresponding months, and then the sample data were integrated according to the same month. In the process, samples with null values were excluded. The total amount of data for each month and the proportion of the training and validation data set are shown in Table 3, and the models were trained by month. After finishing model training, the monthly satellite remote sensing data were input into the trained models to estimate monthly mean NLSAT at a spatial resolution of 0.5° × 0.5°, and the model prediction results were used for comparison and analysis.

3.1. Comparison of the Accuracy of Different Models

We extracted the prediction results of the six models and the MODIS satellite data at the meteorological stations, and then calculated the RMSE, MAE, and R² using the observed temperature data and the corresponding estimates. Figure 7 shows the boxplots of these error indicators. The ordering of the mean RMSE values is RF < SVM < GPR < MODIS < LSTM < BiLSTM < NN, and the corresponding values are 2.01 °C, 2.33 °C, 2.92 °C, 7.12 °C, 7.14 °C, 7.47 °C, and 7.98 °C. The ordering of the mean MAE values is the same, RF < SVM < GPR < MODIS < LSTM < BiLSTM < NN, and the corresponding values are 1.36 °C, 1.58 °C, 1.89 °C, 5.37 °C, 6.21 °C, 6.67 °C, and 7.37 °C. The ordering of the mean R² values is MODIS < LSTM < BiLSTM < GPR < NN < SVM < RF, and the corresponding values are 0.45, 0.72, 0.758, 0.76, 0.84, 0.86, and 0.89. The results show that the RF model had the lowest estimation errors on all three indicators. The LSTM, BiLSTM, and NN models had the highest RMSE and MAE values, although the NN model performed comparatively well in terms of R².

Overall, the estimation results of the RF model have the smallest error, and the corresponding R² values are above 0.8. The results are worst for the LSTM and BiLSTM models, and the range of R² values fluctuates from 0.1 to 0.9, showing unstable performance depending on the season. For different months, the most suitable model for NLSAT estimation is the RF model.

In addition, the models required parameter tuning before they could be compared. In the process of parameter tuning, we found that the performance of several models was relatively stable. As such, adjusting several hyperparameters did not significantly improve the prediction accuracy of the models, as shown in Figure 8. This may be because of the obvious relationship between the input elements of the training set and the temperature, which allows the models to quickly learn the correlation between them and reach a stable representation.

3.2. Comparison of Model Accuracy in Different Seasons

To examine the performance of these models at different times of the year, we compared the estimated results by season. In meteorology, March–May is usually considered as spring, June–August as summer, September–November as autumn and December–February as winter, and January, April, July and October are often used as representative months for winter, spring, summer and autumn, respectively. Therefore, we chose the estimates for April, July, October, and January to represent the temperature conditions of the corresponding seasons.
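The seasonal grouping described above can be written as a small lookup. This sketch only encodes the month-to-season convention and the representative months stated in the text; the names `SEASON_BY_MONTH`, `REPRESENTATIVE_MONTH`, and `season_of` are hypothetical.

```python
# Meteorological seasons as described in the text (Northern Hemisphere).
SEASON_BY_MONTH = {
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
    12: "winter", 1: "winter", 2: "winter",
}

# Representative month used for each season.
REPRESENTATIVE_MONTH = {"winter": 1, "spring": 4, "summer": 7, "autumn": 10}

def season_of(month):
    """Return the season name for a calendar month number (1-12)."""
    if month not in SEASON_BY_MONTH:
        raise ValueError("month must be an integer from 1 to 12")
    return SEASON_BY_MONTH[month]

# Each representative month belongs to the season it stands for.
consistent = all(season_of(m) == s for s, m in REPRESENTATIVE_MONTH.items())
```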

Figure 9 shows the estimated results for the spring of 2015: the NN model has high temperature estimates in the northwest and southwest regions, and the prediction results in other regions are consistent with the corresponding temperature distribution in China. The estimation results of the LSTM model show a trend of overestimation in southern China and normal results in other regions. The prediction result of the BiLSTM model has a large area of error. There is similar temperature distribution throughout the northern and Qinghai–Tibet Plateau regions of China, and there are significantly fewer high temperature areas in the southern region. The results of the RF and SVM models are similar, with low temperatures in central and eastern China, and some high temperature areas missing in northwest China. The results of the GPR model show estimation anomalies in the Qinghai–Tibet Plateau region and the northwest region. The single MODIS surface temperature data exhibited high errors, which did not match the actual temperature distribution and had a significant trend of high bias. The spring results show that among the six models, the NN model has the best prediction effect, followed by the RF and SVM models, then the LSTM model, the GPR model, and the BiLSTM model successively.

Figure 10 shows the estimated results for the summer of 2015. The estimated temperature values of the NN model are generally high in regions other than the Qinghai–Tibet Plateau. NLSAT in most areas is higher than 30 °C without significant geographical variability, but there is vertical variability in temperature in the Qinghai–Tibet Plateau and Tianshan regions. The LSTM and BiLSTM models show estimation results similar to those of the NN model, and the estimated temperatures are generally high in inland China. However, the prediction results of these two models are better than those of the NN model in the Qinghai–Tibet Plateau region. The prediction results of the RF model and the SVM model show a slight underestimation in the high temperature regions. In addition, homogenization also occurs in the Qinghai–Tibet Plateau region, which does not reflect the low temperature effect of high-altitude regions in summer. The estimation results of the GPR model are nearly uniform across the country, which is not consistent with the actual temperature distribution. The single remote sensing data product exhibits a high temperature distribution all over the mainland of China due to the strong solar radiation in summer. Therefore, among the summer prediction results of the six models, the GPR model performs the worst. The prediction results of the other models also do not reproduce the actual temperature distribution. Moreover, the quality of temperature estimation also varies from region to region.

Figure 11 shows the estimated results for the autumn of 2015, and the NN model has high temperature estimates in the central and southern regions. The prediction results of the LSTM and BiLSTM models are similar, with large temperature estimation anomalies, high estimates in the central and southern regions, and low estimates in the northwest region. In addition, the prediction results of the RF model and the SVM model are similar, with high temperatures in the northwest and northeast regions. Additionally, the Qinghai–Tibet Plateau region does not show the expected temperature differences. The results of the GPR model show anomalies in the Qinghai–Tibet Plateau region and the northwest region, failing to show the effects of topographic conditions on temperature. The MODIS surface temperature data values were generally high for NLSAT. The results in autumn show that among the six models, RF and SVM models had the best predictions, followed by NN models, then GPR models, and LSTM and BiLSTM models had the worst results.

Figure 12 shows the estimated results for the winter of 2015. The estimation results of the NN model are good and conform to the actual temperature distribution in China, but the low temperatures in the Qinghai–Tibet Plateau and Tianshan region are not detected well. The estimation results of the LSTM model show that the temperature values are generally low throughout China, and the model does not recognize the low temperature in the northeast and northwest regions well. The prediction results of the BiLSTM model also show errors in large areas, with low temperatures in the north and significantly fewer high temperature areas in the south throughout China, but the model does respond to the low temperatures in the northeast and northwest regions. The prediction results of the RF model and the SVM model are similar, with better overall temperature estimates, although both fail to estimate the low temperatures in the Tibetan Plateau and Tianshan region. In the Tibetan Plateau, Northwest and Northeast China, the temperature estimation results of the GPR model are high, and there is no obvious regional difference. The single MODIS surface temperature data reflect the temperature distribution more accurately, indicating that solar radiation in winter is one of the main determinants of temperature. The winter results show that among the six models, the NN, RF, and SVM models have the best prediction effects, followed by the GPR model, and the LSTM model and BiLSTM model have the worst effects.

The estimation results of the four seasons of 2015 show that the NN model is the best in spring, RF and SVM are the best in summer and autumn, and the RF model results are the best in winter. The model performance varies with the season, and no single model is best for predicting NLSAT in all seasons. The potential reason is that the typical monsoon climate in China brings four distinct seasons with complex seasonal changes in air temperature, so that a single model cannot estimate NLSAT well in every season. However, a combination of multiple models can be used to better obtain the NLSAT for different seasons.

3.3. Comparison of Model Accuracy in Different Regions

The seven geographic divisions used in this paper are the results of many years of scientific research by many authoritative experts in the natural geographic zoning of China. Mainly based on the different regional characteristics, the characteristics of geographical location, natural geography and human geography, and following the principles of combining zonality and non-zonality, the principle of major factors and comprehensive analysis, the principle of relative consistency, the principle of occurrence and the principle of geographical conjugation, China is divided into seven geographical divisions: northeast, east, north, central, south, southwest and northwest. At present, this division is commonly used by teachers and students of geography majors in Chinese universities, and is a common way of geographical zoning in China [25,26,27]. The specific regional divisions are shown in Figure 13. According to the geographical division of China, we divide the final estimated data into seven sub-regions (Northeast, North, Central, East, South, Southwest, and Northwest) to explore the geographical applicability of various models.
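A regional comparison of this kind amounts to grouping station errors by region and then picking, for each region, the model with the lowest error. The sketch below illustrates that bookkeeping only; the records, RMSE values, and function names are hypothetical and are not results from this study.

```python
import math
from collections import defaultdict

def rmse_by_region(records):
    """records: iterable of (region, observed, predicted) -> {region: RMSE}."""
    squared = defaultdict(list)
    for region, obs, pred in records:
        squared[region].append((obs - pred) ** 2)
    return {region: math.sqrt(sum(v) / len(v)) for region, v in squared.items()}

def best_model_per_region(rmse_table):
    """rmse_table: {model: {region: RMSE}} -> {region: model with lowest RMSE}."""
    best = {}
    for model, by_region in rmse_table.items():
        for region, value in by_region.items():
            if region not in best or value < best[region][1]:
                best[region] = (model, value)
    return {region: model for region, (model, _) in best.items()}

# Toy station records (assumption): region, observed °C, predicted °C.
records = [("Northeast", 1.0, 2.0), ("Northeast", 3.0, 2.0),
           ("South", 20.0, 23.0), ("South", 22.0, 18.0)]
regional_rmse = rmse_by_region(records)

# Toy per-model regional RMSE table (assumption).
toy_table = {"RF": {"Northeast": 1.0, "South": 2.5},
             "SVM": {"Northeast": 1.5, "South": 2.0}}
winners = best_model_per_region(toy_table)
```

The same grouping can be repeated with MAE or R² (taking the maximum instead of the minimum for R²) to build the three panels of a regional comparison.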

Figure 14a shows the mean values of RMSE for different models and regions. The RF model has the smallest RMSE value in northeast, southwest, and north China, and the NN model has the largest RMSE value in northeast region. The RF model has the best effects in Northwest China, and the single MODIS data have the largest errors. In Central China and South China, the minimum value of RMSE is from the SVM model and the maximum value is from the NN model, indicating that the SVM model works best with high accuracy in these two regions. Additionally, in East China the minimum value of RMSE is derived from GPR model, and the maximum value is derived from MODIS satellite data. These results show that different regions have their optimal models, that the RF model has a stable performance in all regions, and that the NN model has the worst results.

Figure 14b shows the mean MAE for each model and region. In Northeast, Southwest, and North China, the smallest values come from the RF model and the largest from the NN model. In Northwest China, the RF model performs best and the MODIS data used alone perform worst. In Central and South China, the MAE values of the RF and SVM models differ little; both are low, so either model can estimate NLSAT effectively. In East China, the RF, SVM, and GPR models all achieve good prediction results. The MAE results show that the RF, SVM, and GPR models perform well for NLSAT estimation and that their relative performance varies from region to region.

Figure 14c shows the R² values for the different regions. The RF model has the highest R² in Northwest, Southwest, North, and Central China, the SVM model in South China, and the GPR model in East China. The LSTM model has the lowest R² in Northeast and South China, and the MODIS surface temperature product has the lowest in Northwest, Southwest, Central, East, and North China. These results show that the suitability of a model depends on the region, and that the RF model is accurate in several regions and the most stable of the models. The LSTM-family models are less practical for NLSAT estimation, and a single remote sensing data product does not represent NLSAT accurately.
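
The per-region comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the record layout, and the sample values are hypothetical placeholders, and R² is assumed here to be the coefficient of determination (the paper's own definition is given in its Section 2.4).

```python
import math
from collections import defaultdict


def rmse(obs, est):
    """Root mean square error between observed and estimated values."""
    return math.sqrt(sum((e - o) ** 2 for o, e in zip(obs, est)) / len(obs))


def mae(obs, est):
    """Mean absolute error between observed and estimated values."""
    return sum(abs(e - o) for o, e in zip(obs, est)) / len(obs)


def r2(obs, est):
    """Coefficient of determination (assumed definition of R²)."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - e) ** 2 for o, e in zip(obs, est))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def metrics_by_region(records):
    """Group (region, model, observed, estimated) records and score each group."""
    groups = defaultdict(lambda: ([], []))
    for region, model, observed, estimated in records:
        obs, est = groups[(region, model)]
        obs.append(observed)
        est.append(estimated)
    return {
        key: {"rmse": rmse(obs, est), "mae": mae(obs, est), "r2": r2(obs, est)}
        for key, (obs, est) in groups.items()
    }


def best_model_per_region(table, metric="rmse", higher_is_better=False):
    """Pick, for each region, the model with the best value of one metric."""
    sign = -1.0 if higher_is_better else 1.0
    best = {}
    for (region, model), scores in table.items():
        current = best.get(region)
        if current is None or sign * scores[metric] < sign * table[(region, current)][metric]:
            best[region] = model
    return best


# Hypothetical placeholder records (degrees Celsius); not values from the study.
records = [
    ("region_1", "model_a", -10.0, -9.0),
    ("region_1", "model_a", 0.0, 1.0),
    ("region_1", "model_a", 10.0, 9.0),
    ("region_1", "model_b", -10.0, -7.0),
    ("region_1", "model_b", 0.0, 3.0),
    ("region_1", "model_b", 10.0, 7.0),
]
table = metrics_by_region(records)
print(best_model_per_region(table))
```

For R², where larger is better, the same helper can be called with `metric="r2", higher_is_better=True`.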

To assess the models in different months and regions, we calculated the evaluation metrics for each month of 2015 in the seven regions; the results are shown in Figure 15, Figure 16 and Figure 17. The results in winter are significantly better than those in summer, and the results in spring and autumn fall between the two, consistent with the seasonal results. In addition, the RF, SVM, and GPR models have stable metric values in every month, while the LSTM, BiLSTM, and NN models are more volatile. The models have large errors in Central, South, and East China, mainly because of errors in the MODIS surface temperature data for these areas. These three areas receive more surface radiation than other parts of China, which leads to high MODIS values. In addition, because the monsoon climate makes NLSAT in these three regions complex and variable, a single MODIS land surface temperature product cannot fully depict the NLSAT distribution there and struggles to provide accurate and detailed near-surface distributions.

4. Discussion

According to the results, the RF model estimates NLSAT over a wide area with higher accuracy and reliability. This is mainly because the RF model can limit overfitting by taking the aggregated result of its decision trees as the final result, which brings the final result closer to the actual temperature. The other models each use a fixed structural expression during training, and their predicted temperatures are generated from that specific structure; they lack the randomness of the random forest, which makes their final results less accurate than those of the RF model [20,21,22,23,24]. Noise in the data and the sample size may lead to overfitting. Although we tuned each model and removed null values from the sample data set, the characteristics of the sample data set itself, possible overfitting, and the limitations of the models themselves may still affect the estimation results. In addition, we found that the LSTM and BiLSTM models perform poorly in this application: they are better suited to sequence prediction, and in regression analysis they are weaker than the other models. The suitability of a model for a particular research task is therefore also one of the factors behind the differing results.
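
The stabilizing effect of averaging many trees can be illustrated with a small simulation. This is an idealized sketch, not the RF algorithm itself: each "tree" is replaced by the true value plus independent Gaussian noise, and all names and parameter values are hypothetical. Real trees in a forest are correlated, so the reduction is smaller in practice: for B trees with variance σ² and pairwise correlation ρ, the variance of the average is ρσ² + (1 − ρ)σ²/B.

```python
import random
import statistics


def noisy_estimate(true_value, noise_sd, rng):
    """Stand-in for one tree: the true value plus independent Gaussian error."""
    return true_value + rng.gauss(0.0, noise_sd)


def ensemble_estimate(true_value, noise_sd, n_trees, rng):
    """Average of n_trees independent noisy estimates."""
    return statistics.fmean(
        noisy_estimate(true_value, noise_sd, rng) for _ in range(n_trees)
    )


def error_variance(n_trees, trials=2000, true_value=15.0, noise_sd=1.0, seed=0):
    """Variance of the ensemble error over repeated trials."""
    rng = random.Random(seed)
    errors = [
        ensemble_estimate(true_value, noise_sd, n_trees, rng) - true_value
        for _ in range(trials)
    ]
    return statistics.pvariance(errors)


var_single = error_variance(n_trees=1)   # one estimator on its own
var_forest = error_variance(n_trees=25)  # average of 25 independent estimators
print(f"single: {var_single:.3f}, averaged: {var_forest:.3f}")
```

With independent errors the variance of the average shrinks roughly in proportion to 1/B, which is the idealized limit of the formula above when ρ = 0.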

The comparison across seasons showed that the models perform better in winter than in summer. The predictions are based on the MODIS surface temperature product, so they are influenced by the accuracy of that satellite product, which varies from season to season. The MODIS observations in winter are more consistent with the actual temperature distribution in China than those in summer, so using MODIS data as model input also makes the winter predictions better than the summer ones. In addition, MODIS data are influenced by surface radiation, which makes the prediction results differ between regions. For example, snow cover on the Qinghai–Tibet Plateau affects the inversion results of the satellite remote sensing products, and this error in the remote sensing input carries through to the final temperature predictions. The same effect exists in the snow- and ice-covered Northwest and Northeast regions in winter. Meanwhile, the temperature differences caused by China’s typical monsoon climate also affect the applicability of the models in different seasons. In winter, temperatures differ markedly between regions of China, so the models have sufficiently varied regional samples during training to simulate the actual situation, and their predictions are good. In summer, when China is generally hot, the training samples are relatively uniform and the models cannot be adequately trained, so the predictions are poor; the large areas of predicted high temperature also weaken the influence of topographical factors. Together, these effects make the winter results better than the summer results.

The results for different regions show that the most suitable models are RF for Northwest, Northeast, Southwest, North and Central China, SVM for South China, and GPR for East China. Again, no single model is consistently best for predicting NLSAT in all regions, mainly because China is a vast area with diverse and complex geographic elements and varied geographic distributions. These factors strongly affect the distribution of NLSAT, which makes it hard for one model to express the NLSAT distribution pattern in every region. However, an applicable model for estimating the NLSAT distribution can be chosen by comparing the results of multiple models.

NLSAT depends on a combination of factors such as latitude, topography, land and sea position, ocean currents, subsurface, human activities, and extreme weather events. However, given the difficulty of quantifying these factors and the problems of data collection and data accuracy, we selected only some representative influencing factors, so our study does not consider the influencing factors completely. Resolving this may depend on further development of monitoring technology, so that the influencing factors can be obtained fully for more accurate prediction of near-surface temperature. In addition, the methods we use are model learning methods, which belong to the category of nonlinear regression, so the weight relationships among the influencing factors are not as explicit as in linear regression analysis. The tuning of the parameters of the multiple models also helps, to a certain extent, to avoid collinearity, which minimizes its influence [28,29,30].
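
One common way to screen predictors for collinearity is to inspect pairwise Pearson correlations and the variance inflation factor (VIF). The sketch below is illustrative only: the paper does not report such a check in this section, the column values are hypothetical placeholders (only the abbreviations `o_day`, `y_day` and `DEM` come from Table 2), and the 0.9 threshold is an arbitrary assumption.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def vif_two_predictors(r):
    """VIF when a predictor is regressed on one other predictor: 1 / (1 - r^2)."""
    return 1.0 / (1.0 - r * r)


def flag_collinear_pairs(columns, threshold=0.9):
    """Return (name_a, name_b, r) for predictor pairs with |r| >= threshold."""
    names = list(columns)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(columns[a], columns[b])
            if abs(r) >= threshold:
                flagged.append((a, b, r))
    return flagged


# Hypothetical placeholder values: Terra and Aqua daytime LST (K) and elevation (m).
columns = {
    "o_day": [280.0, 285.0, 290.0, 295.0],
    "y_day": [281.0, 286.5, 291.0, 296.5],
    "DEM": [1500.0, 200.0, 900.0, 50.0],
}
flagged = flag_collinear_pairs(columns)
for a, b, r in flagged:
    print(a, b, round(r, 3), round(vif_two_predictors(r), 1))
```

A flagged pair does not have to be dropped; regularized or tree-based models often tolerate correlated inputs, but the flag shows where individual feature weights are hard to interpret.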

5. Conclusions

In this paper, the actual NLSAT values from meteorological stations and remote sensing data are used as the basic data for the study. Various models are employed to estimate the NLSAT distribution on a regional scale in China. The NLSAT estimates are then compared with the actual station values, and the NLSAT estimation capability of each model is evaluated.

There is a clear correlation between the model estimates and the actual values, and a representative NLSAT distribution cannot be obtained accurately from remote sensing data products alone. Combining remote sensing data with other data yields a more accurate and finer NLSAT. In terms of individual models, the RF model performs best and has high accuracy, because it can limit overfitting by taking the aggregated result of its decision trees as the final result. The next best performers were the SVM and GPR models, followed by the NN model and then the LSTM and BiLSTM models. Across seasons, the best estimates were obtained in winter, followed by spring and autumn, and the worst in summer, mainly because of the inversion accuracy of the remote sensing data themselves. In terms of geographical distribution, the optimal model is the RF model in Northeast, Northwest, North, Southwest and Central China and the SVM model in South and East China; the optimal models differ between geographical areas, and there is no absolute optimal model. Therefore, for different regional scales and time scales, the optimal model for estimating the NLSAT should be selected according to the actual situation of the study area and the characteristics of the model itself.

In summary, our method of estimating NLSAT using models combined with remote sensing data can provide a suitable model selection scheme for NLSAT estimation studies. It also provides a methodological reference for related studies that need NLSAT on a regional scale and can obtain it with the model prediction scheme. In addition, it has some reference value for policy formulation in agricultural and livestock development, drought prevention, and urban heat island response. For example, using temperature estimates to determine which crops are suitable for promotion, predicting the spatial development trend of drought, and continuously monitoring urban heat island areas can improve the efficiency of economic development and reduce losses.

The actual prediction performance for NLSAT is influenced by geographical factors. Therefore, future work needs more accurate and detailed data on geographic environmental factors as inputs to the prediction model for the study area in order to obtain better temperature distribution results. To address the low usefulness of the models in summer, we will consider adjusting the model structure or adding and changing influencing factors. We have only considered six models; many models remain untested, and we will consider more of them in the future. Meanwhile, we also need to improve the quality of the data set to reduce the effects of the data on the models themselves. In addition, the integrity and accuracy of remote sensing data are affected by reflectivity and cloudiness, and using only a single satellite data product may introduce some uncertainty into the prediction results. Therefore, future studies should also combine multi-source satellite data to improve data integrity and reliability as much as possible.

Author Contributions

Conceptualization, S.Z. and D.Y.; methodology, D.Y. and S.Z.; software, X.M.; validation, D.Y. and F.N.; formal analysis, S.Z. and X.Y.; investigation, S.Z. and W.Z.; resources, X.M.; data curation, D.Y.; writing—original draft preparation, D.Y.; writing—review and editing, S.Z.; visualization, D.Y.; supervision, X.M.; project administration, X.M. and S.Z.; funding acquisition, X.M. and S.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research is partially supported by the National Natural Science Foundation of China (Grant No. 72174031) and the Youth Scholar Program of Beijing Academy of Science and Technology (Contract No. YS202004).

Data Availability Statement

Data used in this research are freely available online and upon request.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Zhang, L.; Huang, X.; Wang, X. A Review on Air Temperature Estimation by Satellite Thermal Infrared Remote Sensing. J. Nat. Resour. 2014, 29, 540–552.
2. Portner, H.; Roberts, D.; Constable, A. IPCC, 2022: Summary for Policymakers; Cambridge University Press: Cambridge, UK, 2022.
3. Zhu, X.; Zhang, Q.; Xu, C.Y.; Sun, P.; Hu, P. Reconstruction of high spatial resolution surface air temperature data across China: A new geo-intelligent multisource data-based machine learning technique. Sci. Total Environ. 2019, 665, 300–313.
4. Qi, S.; Wang, J.; Zhang, Q.; Lu, C.; Zheng, L. Study on the Estimation of Air Temperature from MODIS Data. Natl. Remote Sens. Bull. 2005, 9, 570–575.
5. Zhu, S.; Zhang, G. Progress in Near Surface Air Temperature Retrieved by Remote Sensing Technology. Adv. Earth Sci. 2011, 26, 724–730.
6. Chen, E.; Allen, L., Jr.; Bartholic, J.; Gerber, J. Comparison of winter-nocturnal geostationary satellite infrared-surface temperature with shelter-height temperature in Florida. Remote Sens. Environ. 1983, 13, 313–327.
7. Kawashima, S.; Ishida, T.; Minomura, M.; Miwa, T. Relations between surface temperature and air temperature on a local scale during winter nights. J. Appl. Meteorol. Climatol. 2000, 39, 1570–1579.
8. Vancutsem, C.; Ceccato, P.; Dinku, T.; Connor, S. Evaluation of MODIS land surface temperature data to estimate air temperature in different ecosystems over Africa. Remote Sens. Environ. 2010, 114, 449–465.
9. Cresswell, M.; Morse, A.; Thomson, M.C.; Connor, S. Estimating surface air temperatures, from Meteosat land surface temperatures, using an empirical solar zenith angle model. Int. J. Remote Sens. 1999, 20, 1125–1132.
10. Florio, E.; Lele, S.; Chi, Y.; Sterner, R.; Glass, G. Integrating AVHRR satellite data and NOAA ground observations to predict surface air temperature: A statistical approach. Int. J. Remote Sens. 2004, 25, 2979–2994.
11. Cristóbal, J.; Ninyerola, M.; Pons, X.; Pla, M. Improving air temperature modelization by means of remote sensing variables. In Proceedings of the 2006 IEEE International Symposium on Geoscience and Remote Sensing, Denver, CO, USA, 31 July–4 August 2006; Volume 1, pp. 2251–2254.
12. Zhao, D.; Zhang, W.; Shijin, X. A neural network algorithm to retrieve nearsurface air temperature from landsat ETM+ imagery over the Hanjiang River Basin, China. In Proceedings of the 2007 IEEE International Geoscience and Remote Sensing Symposium, Barcelona, Spain, 23–28 July 2007; Volume 1, pp. 1705–1708.
13. Lins, I.; Araujo, M.; das Chagas Moura, M.; Silva, M.; Droguett, E. Prediction of sea surface temperature in the tropical Atlantic by support vector machines. Comput. Stat. Data Anal. 2013, 61, 187–198.
14. Moser, G.; De Martino, M.; Serpico, S. Estimation of air surface temperature from remote sensing images and pixelwise modeling of the estimation uncertainty through support vector machines. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2015, 8, 332–349.
15. Abbot, J.; Marohasy, J. The application of machine learning for evaluating anthropogenic versus natural climate change. GeoResJ 2017, 14, 36–46.
16. Mehrkanoon, S. Deep shared representation learning for weather elements forecasting. Knowl.-Based Syst. 2019, 179, 120–128.
17. Tao, Y.; Du, J. Temperature prediction using long short term memory network based on random forest. Comput. Eng. Design. 2019, 40, 737–743.
18. He, Q.; Wang, M.; Liu, K. Spatial Interpolation of Air Temperature Based on Machine Learning. Plateau Meteorol. 2022, 41, 733–748.
19. Yu, L.; Liu, Y.; Liu, T.; Yan, F. Impact of recent vegetation greening on temperature and precipitation over China. Agric. For. Meteorol. 2020, 295, 108197.
20. Graves, A. Long short-term memory. Supervised sequence labelling with recurrent neural networks. In Studies in Computational Intelligence; Springer: Berlin/Heidelberg, Germany, 2012; pp. 37–45.
21. Salcedo-Sanz, S.; Deo, R.; Carro-Calvo, L.; Saavedra-Moreno, B. Monthly prediction of air temperature in Australia and New Zealand with machine learning algorithms. Theor. Appl. Climatol. 2016, 125, 13–25.
22. Ghorbani, M.; Shamshirband, S.; Haghi, D.; Azani, A.; Bonakdari, H.; Ebtehaj, I. Application of firefly algorithm-based support vector machines for prediction of field capacity and permanent wilting point. Soil Tillage Res. 2017, 172, 32–38.
23. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
24. Seeger, M. Gaussian processes for machine learning. Int. J. Neural Syst. 2004, 14, 69–106.
25. Fan, J.; Zhou, K.; Sheng, K.; Guo, R.; Chen, D.; Wang, Y.; Liu, H.; Wang, Z.; Sun, Y.; Zhang, J.; et al. Territorial function differentiation and its comprehensive regionalization in China. Sci. Sin. 2023, 53, 36–255.
26. Liu, Y.; Wang, F.; Zhang, Z. Comprehensive assessment of “climate change-crop yield-economic impact” in seven sub-regions of China. Clim. Chang. Res. 2021, 17, 11.
27. Liu, M.; Guo, X.; Huang, F. Spatial and temporal distribution patterns of PM2.5 in seven major regions of China in 2016. In Proceedings of the Sixth Postgraduate Academic Forum, School of Public Health, Capital Medical University, Beijing, China, 15 June 2017; pp. 353–365.
28. Marquardt, D. Generalized inverses, ridge regression, biased linear estimation, and nonlinear estimation. Technometrics 1970, 12, 591–612.
29. Ng, A.Y. Feature selection, L1 vs. L2 regularization, and rotational invariance. In Proceedings of the Twenty-First International Conference on Machine Learning, Banff, AB, Canada, 4–8 July 2004; p. 78.
30. James, G.; Witten, D.; Hastie, T.; Tibshirani, R. An Introduction to Statistical Learning; Springer: New York, NY, USA, 2013; Volume 112, p. 18.

Figure 1. Location map of meteorological stations and the rough elevation.

Figure 2. Data correlation heat map between mean NLSAT and considered environment factors.

Figure 3. The multilayer architecture diagram of a typical neural network model.

Figure 4. The architecture diagram of Long Short-Term Memory network model.

Figure 5. The architecture diagram of Bi-Directional Long Short-Term Memory network model.

Figure 6. The methodological framework and flowchart for evaluation and comparison of selected models.

Figure 7. RMSE, MAE and R² box plots for different models.

Figure 8. (a) NN tuning by batch size. (b) NN tuning by layer number. (c) LSTM tuning by layer number. (d) BiLSTM tuning by layer number. (e) SVM tuning situation. (f) RF tuning by number of random trees. (g) GPR tuning situation.

Figure 9. The NLSAT estimation results for the spring of 2015.

Figure 10. The NLSAT estimation results for the summer of 2015.

Figure 11. The NLSAT estimation results for the autumn of 2015.

Figure 12. The NLSAT estimation results for the winter of 2015.

Figure 13. Map of Geographical Divisions of China.

Figure 14. (a) RMSE mean values for different regions. (b) MAE mean values for different regions. (c) R² mean values for different regions.

Figure 15. RMSE values by month for different regions.

Figure 16. MAE values by month for different regions.

Figure 17. R² values by month for different regions.

Table 1. Some previous research work for NLSAT.

Reference	Study Area	Spatial Scale	Time Scale	Method	Accuracy
Kawashima et al. [7]	The Kanto plain and its Surrounding mountainous area	Station	Day	Linear regression, Multiple linear regression	R² About 0.8
Cresswell et al. [9]	Southern Africa	Station	Hour	Linear regression	R² About 0.82
Florio et al. [10]	the South-Central United States	Station	Day	Multiple linear regression	R² About 0.45
Cristobal et al. [11]	Catalonia (Northeast Spain)	Station	Day, Month	Multiple linear regression	R² About 0.79
Zhao et al. [12]	The upstream basin of the Hanjiang River China	30 m	Day	Neural Network	R² About 0.85
Lins et al. [13]	The Tropical Atlantic	Station	Day	SVM	R² About 0.3
Moser et al. [14]	The Provence-Alpes-Côte d’Azur (France)	5 km	Month	SVM	R² About 0.75
Abbot et al. [15]	Multigeographic area	Station	Year	Neural Network	R² About 0.6
Mehrkanoon et al. [16]	Netherlands, Belgium, Denmark	Station	Day	CNN, LSTM	MAE About 1~3
Tao et al. [17]	Nanjing	Station	Hour	RF, LSTM, SVM	MAE About 1.5
He et al. [18]	China	0.25°	Month	RF, SVM, GPR	R² About 0.7

Table 2. Main data sources and their use in this study.

Data Type	Abbreviation	Source	Data Use
Monthly average of air temperature	mean	Meteorological station	Station location model training, Accuracy verification
Elevation data	DEM	the Shuttle Radar Topography Mission	Station location model training, Full map range prediction input
Terra satellite daytime surface temperature	o_day	NASA’s MOD11C3 product	Station location model training, Full map range prediction input
Terra satellite nighttime surface temperature	o_night	NASA’s MOD11C3 product	Station location model training, Full map range prediction input
Terra Satellite Enhanced Vegetation Index	o_evi	NASA’s MOD13C2 product	Station location model training, Full map range prediction input
Terra Satellite Vegetation Index	o_ndvi	NASA’s MOD13C2 product	Station location model training, Full map range prediction input
Aqua satellite daytime surface temperature	y_day	NASA’s MYD11C3 product	Station location model training, Full map range prediction input
Aqua satellite nighttime surface temperature	y_night	NASA’s MYD11C3 product	Station location model training, Full map range prediction input
Aqua Satellite Enhanced Vegetation Index	y_evi	NASA’s MYD13C2 product	Station location model training, Full map range prediction input
Aqua Satellite Vegetation Index	y_ndvi	NASA’s MYD13C2 product	Station location model training, Full map range prediction input

Table 3. The data volume for each month and the proportion of the training and validation data set.

Month	Data Volume	Training Set (%)	Validation Set (%)
January	7802	90	10
February	7849	90	10
March	7907	90	10
April	7919	90	10
May	7876	90	10
June	7830	90	10
July	7887	90	10
August	7889	90	10
September	7897	90	10
October	7929	90	10
November	7914	90	10
December	7820	90	10
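
The 90/10 division in Table 3 could be reproduced along the following lines. This is a hedged sketch: the table gives only the proportions, so the random shuffling, the fixed seed, the rounding rule, and the function name `split_indices` are all assumptions made for illustration.

```python
import random


def split_indices(n_samples, train_fraction=0.9, seed=42):
    """Shuffle sample indices and split them into training and validation sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed so the split is repeatable
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


# January data volume from Table 3.
train_idx, val_idx = split_indices(7802)
print(len(train_idx), len(val_idx))
```

Under these assumptions the January data set of 7802 samples yields 7022 training and 780 validation samples.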

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
