Near-Surface Air Temperature Retrieval Using a Deep Neural Network from Satellite Observations over South Korea

Abstract: Air temperature (Ta), defined as the temperature 2 m above the land surface, is one of the most important factors in environmental and climate studies. Ta can be estimated from the land surface temperature (LST), which is retrieved over large areas from the 11- and 12-µm bands of satellite imagery and is highly correlated with Ta. To measure Ta over a broad area, we studied a Ta retrieval method based on a deep neural network (DNN), using in-situ and satellite data of South Korea from 2014 to 2017. To retrieve accurate Ta, we selected appropriate input variables and conditions for the DNN model. As a result, the Normalized Difference Vegetation Index (NDVI), the Normalized Difference Water Index (NDWI), and the 11- and 12-µm band data were applied to the DNN model as input variables, and the model conditions were selected by testing various configurations. In validation, the best DNN model retrieved Ta with a correlation coefficient of 0.98 and a root mean square error (RMSE) of 2.19 K. We then conducted three additional analyses to validate the accuracy: spatial representativeness, seasonal analysis, and time series analysis. In the spatial representativeness test, window sizes less than 132 × 132 showed high accuracy, with a correlation coefficient over 0.97, an RMSE of 1.96 K, and a bias of −0.00856 K. In the seasonal analysis, spring showed the lowest accuracy, with an RMSE of 2.82 K, whereas the other seasons showed RMSE values under 2 K. We also analyzed time series at six Automated Synoptic Observing System (ASOS) points (i.e., locations) using data obtained from 2018 to 2019; all of the individual correlation coefficients were over 0.97 and the RMSE values were under 2.41 K. With these analyses, we confirmed that the accuracy of the DNN model was higher than that of previous studies. The retrieved Ta could be used in other studies or climate models to address urban problems such as urban heat islands and to analyze the effects of the Arctic Oscillation.


Introduction
Air temperature (Ta), defined as the temperature at 2 m above the land surface, is one of the most important variables in regional and global weather models of the terrain and its characteristics [1,2]. Ta affects the rates of biotic processes in the ecosystem, including phenology, growth, carbon fixation, insolation, and respiration, through vegetation-moisture/water relationships [3][4][5][6][7][8]. Thus, Ta is used in many areas of research that monitor climate change, global warming, and abnormal temperature phenomena. Accurate Ta readings are essential, but each reading represents only a relatively small area because it is measured locally at an in-situ station.
In remote sensing, most products are retrieved from satellite data in various ways over large areas such as the atmosphere, land, and ocean. For example, threshold [...] summer and early autumn. Additionally, South Korea is surrounded on three sides by the sea; thus, both land and sea conditions affect the region's temperature. In particular, previous studies have reported that the Arctic Oscillation affects cold surges [22,23]. If Ta can be retrieved from satellite data, the result can be used to analyze the effects of the Arctic Oscillation. For these reasons, we chose South Korea as our study area to investigate the correlation between Ta and various conditions in the area.

Ta Data
In South Korea, the Korea Meteorological Administration operates an in-situ data collection system, the Automated Synoptic Observing System (ASOS). ASOS has 102 locations throughout South Korea with various land types, which we divided into four categories: urban, plain, forest, and mixed. Urban means that the ASOS point is surrounded by urban structures such as buildings, factories, and roads. Plain means that the ASOS point is surrounded by cropland. Forest means that the ASOS point is surrounded by trees, including mountains. Mixed means that two or more of the three types are present (Figure 1). ASOS collects meteorological data, including temperature, air pressure, wind, humidity, evapotranspiration, and snow data, from every ASOS point at set times throughout the day. ASOS Ta, Ta quality, latitude, and longitude were collected on an hourly basis from 2014 to 2017. We used the ASOS Ta data as input and validation data, which allowed them to be matched with other datasets, including that of Landsat-8. We also used ASOS data from 2018 to 2019 to test the applicability of the model developed in this study.

Landsat-8 Data
Landsat satellites have been conducting remote sensing missions since 1972. Landsat-8 observes the Earth from a near-polar orbit, so its revisit interval is longer than that of a geostationary satellite. Its temporal resolution is 16 days: using two sensors, the Operational Land Imager (OLI) and the Thermal Infrared Sensor (TIRS), it acquires data over the entire Earth in 16 days and observes the same area once every 16 days. The OLI has a 30 m spatial resolution for surface reflectance and the TIRS has a 100 m spatial resolution for brightness temperature (Table 1). The United States Geological Survey resamples the TIRS data to 30 m spatial resolution [24]. To calculate the Landsat-8 variables, we used L2 top-of-canopy reflectance data, i.e., OLI surface reflectance to which atmospheric correction has been applied to remove atmospheric effects, which is important for land surface observations [25,26]. In addition, we used L1 top-of-atmosphere data from the TIRS. The TIRS brightness temperature and the OLI reflectance were calculated from the binary data delivered from the satellite; we therefore considered the accuracy of the data to be assured. We collected 334 scenes each of L1 and L2 data for matching with ASOS Ta data; notably, the observation time offset was less than 30 min. All Landsat-8 data included reflectance or brightness temperature information and quality assurance (QA) information. The QA information consists of 16 bits, and the condition of a pixel can be checked by interpreting each bit. We therefore checked each pixel to ensure good and clear conditions [27].
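The bit-wise QA check described above can be sketched as follows. This is a minimal illustration only: the bit positions in `ASSUMED_FLAGS` are hypothetical placeholders, not the documented Landsat-8 QA layout, and should be replaced with the values from the product guide of the collection in use.

```python
# Hypothetical bit layout: these positions are illustrative assumptions,
# not the documented Landsat-8 QA band definition.
ASSUMED_FLAGS = {"fill": 0, "cloud": 3, "cloud_shadow": 4}


def bit_is_set(qa_value: int, bit: int) -> bool:
    """Return True if the given bit of a 16-bit QA value is 1."""
    return (qa_value >> bit) & 1 == 1


def is_clear(qa_value: int, flags=ASSUMED_FLAGS) -> bool:
    """Keep a pixel only if none of the rejecting flags is set."""
    return not any(bit_is_set(qa_value, bit) for bit in flags.values())


# Three example QA values: no flags, assumed cloud bit, assumed fill + shadow bits.
qa_pixels = [0b0000000000000000, 0b0000000000001000, 0b0000000000010001]
clear_mask = [is_clear(value) for value in qa_pixels]
```

Only pixels whose mask entry is true would be passed on to the index calculations and the matchup step.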

Methods
This work aimed to apply satellite and in-situ data to a DNN to produce an optimal model for Ta retrieval (Figure 2). All data used in this study had to represent clear weather and good observation conditions, so we first conducted a quality check of the collected Landsat-8 data. NDVI and NDWI were then retrieved from the good-quality pixels, and Band 10, Band 11, NDVI, and NDWI were supplied to the DNN model. The ASOS data provide temperature, QA, and spatial information such as latitude and longitude; we therefore checked the quality using the QA data and supplied the Ta data to the DNN model.
Landsat-8 data are available in "scene" form, whereas ASOS Ta data are in "point" form because they are ground-truth data. For this reason, we performed a spatial matchup based on the ASOS Ta data to produce optimized input data for the DNN model. With satellite observations, temporal and spatial matching is essential. Therefore, for each Landsat-8 observation of the study area, we obtained the nearest ASOS Ta data whose observation time differed by no more than 30 min. Because Landsat-8 observes South Korea at 11:00 (KST) every 16 days, we collected ASOS Ta data from 2014 to 2017 to obtain as much data as possible. Spatial matching of the Landsat-8 and ASOS Ta data was performed by calculating the great circle distance (GCD) from the latitudes and longitudes of the Landsat-8 pixels and ASOS points. As a result, all inputs were completely matched in space and time between the dependent variable (ASOS Ta) and the input variables (Band 10, Band 11, NDVI, and NDWI), yielding a total of 3139 matchups.
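A minimal sketch of the spatiotemporal matchup described above, assuming the haversine formula for the great circle distance and a spherical Earth with a mean radius of 6371 km. The function names, the coordinates, and the use of minutes since midnight are hypothetical choices made for illustration; the paper does not state how the GCD was implemented.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great circle distance between two points (degrees) via the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_pixel(station, pixels):
    """Index of the pixel center (lat, lon) closest to the station (lat, lon)."""
    return min(
        range(len(pixels)),
        key=lambda i: great_circle_km(station[0], station[1], pixels[i][0], pixels[i][1]),
    )


def within_offset(satellite_minutes, station_minutes, limit=30):
    """True if the two observation times differ by no more than `limit` minutes."""
    return abs(satellite_minutes - station_minutes) <= limit


# Hypothetical station and pixel-center coordinates, for illustration only.
station = (37.57, 126.97)
pixels = [(37.60, 127.00), (37.5701, 126.9702), (37.50, 126.90)]
best = nearest_pixel(station, pixels)
time_ok = within_offset(11 * 60, 11 * 60 + 20)
```

A matchup would be kept only when both the spatial and the temporal conditions hold.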
The DNN results were verified by randomly dividing all input data in a 70%/30% ratio. Using 70% of the data, we first tested the conditions that could affect the model outcome, including input variables, nodes, batch sizes, and epochs, and we determined the optimal conditions with the remaining 30% of the data, in an attempt to yield the most accurate Ta retrieval. For validation, the retrieved Ta and ASOS Ta data were spatially matched by latitude and longitude. After validating each model condition, we selected the most accurate DNN model. We then validated the result in three additional steps: the first examined each input variable, the second was a seasonal validation, and the third was a point validation, in which the results were validated at individual ASOS points rather than combined across all points.
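The random 70%/30% division can be sketched as below. The fixed seed and the rounding rule for the split point are assumptions made for reproducibility of the example; the source does not specify either.

```python
import random


def split_70_30(samples, seed=0):
    """Shuffle a copy of the samples and split it into 70% / 30% parts."""
    rng = random.Random(seed)  # assumed seed, only to make the example repeatable
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_train = int(round(0.7 * len(shuffled)))  # assumed rounding rule
    return shuffled[:n_train], shuffled[n_train:]


# Stand-in for the 3139 matchups: integer indices instead of real records.
matchups = list(range(3139))
train, valid = split_70_30(matchups)
```

Splitting indices rather than records keeps the example small; in practice each index would refer to one matched set of Band 10, Band 11, NDVI, NDWI, and ASOS Ta values.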

Deep Neural Network
A DNN is a type of machine learning model in which computers build large artificial neural networks (ANNs) that resemble the human brain. Deep learning supplies learning algorithms and an ever-increasing amount of data to large ANNs, allowing continuous improvement in their ability to "think" and "learn" from data. ANNs are used with satellite data in various remote sensing studies; for example, they have been used to retrieve the surface rain rate [28][29][30][31].
The word "deep" in DNN refers to the multiple stacked layers of the neural network; deeper networks generally yield better performance. Deep learning is an artificial intelligence technology that combines unsupervised learning with conventional supervised learning, which requires human intervention, allowing computers to learn on their own like humans. Technically, deep learning is a set of machine learning techniques that teach computers to think like people.
DNNs are the new norm in signal and data processing, achieving state-of-the-art performance in image, audio, and natural language understanding; additionally, they are used to enhance remote sensing techniques [32]. Deep learning performs well in various kinds of pattern recognition, based on classification and retrieval [33]. Here, DNN learning should help to resolve the correlation between Ta and data obtained from Landsat-8.
Remote Sens. 2021, 13, 4334

Model Condition Selection
Various factors can affect the DNN model. In this study, the optimal model was selected by comparing the differences in accuracy according to input variables, nodes, batch size, and epoch. Verification for optimal model selection was performed by randomly dividing the entire dataset into 70% and 30% portions for use as data for Ta production and verification, respectively.
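To make the roles of the input variables and nodes concrete, the following is a minimal sketch of the forward pass of a small fully connected network with four inputs and one output. The number of hidden layers, the ReLU activation, the weight initialization, and the example feature values are all assumptions for illustration; the source does not describe the network at this level of detail, and no training step is shown.

```python
import random


def relu(values):
    """Element-wise rectified linear unit (assumed activation)."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One fully connected layer: weights is a list of rows, one row per output node."""
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, biases)]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (assumed initialization)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(features, layers):
    """Hidden layers use ReLU; the last layer is linear for a regression output."""
    hidden = features
    for index, (weights, biases) in enumerate(layers):
        hidden = dense(hidden, weights, biases)
        if index < len(layers) - 1:
            hidden = relu(hidden)
    return hidden[0]


rng = random.Random(0)
# Four inputs (Band 10, Band 11, NDVI, NDWI); two hidden layers of 50 nodes are hypothetical.
layers = [init_layer(4, 50, rng), init_layer(50, 50, rng), init_layer(50, 1, rng)]
prediction = forward([0.5, 0.5, 0.3, 0.1], layers)  # hypothetical scaled inputs
```

With untrained weights the output is meaningless; the sketch only shows where the number of nodes and the choice of input variables enter the model.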

Variables
In our DNN model, ASOS Ta was the dependent variable, and Band 10, Band 11, NDVI, and NDWI were the independent variables. Changes in the independent variables affect the accuracy of the results obtained using the model. Therefore, we tested each variable individually to establish its relevance. Figure 3 shows the correlation between Ta and each variable. For this test, we built four DNN models, each with Ta as the dependent variable and one of Band 10, Band 11, NDVI, or NDWI as the independent variable. All four models used the same conditions: a batch size of 32, 100 epochs, and 50 nodes. The models were run on Google Colaboratory with a T4 GPU and 25.51 GB of RAM.
Generally, the atmospheric window is assumed to range from 8 to 12 µm; in this wavelength range, there is little absorption by the atmosphere, so radiation passes through easily unless it is absorbed by cloud. LST, as mentioned earlier, is calculated from the energy at 11 and 12 µm that is emitted from the Earth's surface and detected by satellites, i.e., the split-window method [13,34,35,36]. Emissivity must be considered, given that an uncertainty of 1% in emissivity results in an error of 0.5 K in the calculated LST [37]. Ref. [38] proposed a method to derive a log-linear relation between NDVI and emissivity, and we calculated the emissivity using the log-linear relationship between the NDVI and ε11 from previous studies [39][40][41][42]. The emissivity at 11 µm, ε11, was obtained from this relationship.
The Band 10 and Band 11 data assume blackbody emissivity, which differs from the actual emissivity of the Earth's surface. For this reason, Bands 10 and 11 were multiplied by the emissivity in this study before being applied with Ta to the DNN model. For Band 10 (Figure 3a), the R value was 0.96, the RMSE was 2.73 K, and the bias was 0.4819 K. For Band 11 (Figure 3b), the R value was 0.97, the RMSE was 2.33 K, and the bias was −0.0791 K. These results show that Band 10 and Band 11 were highly correlated with Ta.
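A log-linear emissivity relation of the kind described above, followed by the multiplication of a band value by the emissivity, can be sketched as follows. The coefficients `A_ASSUMED` and `B_ASSUMED` are values commonly quoted in the literature for a relation of the form ε = a + b ln(NDVI); whether this study used the same values is not stated in the text, so they are assumptions here, as are the example NDVI and brightness temperature.

```python
import math


def emissivity_log_linear(ndvi, a, b):
    """Log-linear emissivity model: eps = a + b * ln(NDVI), defined for NDVI > 0."""
    if ndvi <= 0:
        raise ValueError("the log-linear relation is undefined for NDVI <= 0")
    return a + b * math.log(ndvi)


# Assumed coefficients for illustration; not confirmed as the values used in the study.
A_ASSUMED, B_ASSUMED = 1.0094, 0.047

eps11 = emissivity_log_linear(0.5, A_ASSUMED, B_ASSUMED)  # hypothetical NDVI of 0.5
scaled_band = eps11 * 295.0  # hypothetical brightness temperature in K
```

The guard for non-positive NDVI matters in practice, because water and bare surfaces can produce NDVI values at or below zero, where the logarithm is undefined.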
NDVI uses the difference between red (Band 4, 0.64 to 0.67 µm) and NIR (Band 5, 0.85 to 0.88 µm) reflectance to distinguish dense from sparse vegetation coverage. Dense (sparse) vegetation coverage shows high (low) NIR reflectance, which affects the NDVI (Equation (2)). Thus, the NDVI is used to resolve land surface changes as they relate to global change, the carbon cycle, land cover/use, and terrestrial ecology research [43]. The NDVI was obtained from two OLI bands (Band 4 and Band 5); each pixel has a 30 m spatial resolution. The relationship between NDVI and LST has been investigated in numerous studies [44]. A previous study [45] investigated LST retrieval algorithms with various NDVI-based land surface emissivity models. According to these earlier studies, NDVI and LST are correlated; thus, NDVI should be related to Ta. In this study, the NDVI was calculated for each pixel, and the pixel condition was checked with QA data before the pixel was included in the calculations:
NDVI = (Band 5 − Band 4)/(Band 5 + Band 4) (2)
We applied NDVI with Ta to the DNN model (Figure 3c). The R value was 0.97, the RMSE was 2.44 K, and the bias was 0.7418 K, indicating that NDVI was highly correlated with Ta.
LST was generally observed to be higher than Ta during the day, as the heat at the Earth's surface was lower than that in the atmosphere [46]. Ta values higher than LST suggest that the satellite observations may have been affected by moisture from cirrus cloud coverage or haze in the air in summer [17,20,47]. In remote sensing, the NDWI represents the surface water condition and is computed in various ways, such as the method of [48], which uses NIR and short-wave IR, and the method of [49], which uses green and NIR wavelengths. Our approach to inferring the surface moisture of a pixel is based on Gao's method, as it is commonly used in remote sensing [50,51]. Here, we calculated the NDWI with Band 3 and Band 6 using Gao's method (Equation (3)). NDWI was calculated for each pixel, and the pixel condition was cross-checked against QA data to ensure that the reading coincided with the appropriate NDWI reading:
NDWI = (Band 6 − Band 3)/(Band 6 + Band 3) (3)
We applied NDWI with Ta to the DNN model (Figure 3d). The correlation coefficient (R) was 0.97, the RMSE was 2.29 K, and the bias was −0.0427 K. Band 10, Band 11, NDVI, and NDWI were all highly correlated with Ta in these tests: in every case, the R value was at least 0.96 and the RMSE was no more than 2.73 K. We thus confirmed the usefulness of these variables and selected them as the input variables of the DNN model in this study.
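Both indices are normalized differences of two band values, so they can share one helper. The sketch below follows the band assignments given in the text (red = Band 4, NIR = Band 5 for NDVI; Band 6 and Band 3 for NDWI as written in Equation (3)). The reflectance values are hypothetical, and returning `None` for a zero denominator is an assumed convention.

```python
def normalized_difference(x, y):
    """Return (x - y) / (x + y), or None when the denominator is zero."""
    total = x + y
    if total == 0:
        return None
    return (x - y) / total


def ndvi(band4_red, band5_nir):
    """NDVI from red and NIR reflectance: (NIR - red) / (NIR + red)."""
    return normalized_difference(band5_nir, band4_red)


def ndwi(band3, band6):
    """NDWI as written in the text: (Band 6 - Band 3) / (Band 6 + Band 3)."""
    return normalized_difference(band6, band3)


# Hypothetical surface reflectance values for a single vegetated pixel.
pixel_ndvi = ndvi(band4_red=0.05, band5_nir=0.45)
pixel_ndwi = ndwi(band3=0.08, band6=0.12)
```

Pixels that fail the QA check would be excluded before these functions are called.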

Batch Size
Batch size, the number of samples processed in a single model update, has a significant impact on the accuracy of the DNN model. A batch size that is too small results in a model that overfits the data, showing a high RMSE (Figure 4). We tested the DNN model with batch sizes that were multiples of 2. The input variables from the LST-related bands, NDVI, and NDWI data were those identified in the previous step. We confirmed an optimal batch size of 256 over the tested range of 32 to 1024, with 10 epochs. With a batch size of 256, the R value was over 0.94 and the RMSE was lowest, at 2.35 K.
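The effect of batch size on the number of weight updates per epoch can be illustrated with a short sketch. The training set size of 2197 is an assumption (about 70% of the 3139 matchups), and the list of batch sizes assumes doubling steps between 32 and 1024, which the text does not state explicitly.

```python
import math


def updates_per_epoch(n_samples, batch_size):
    """Number of mini-batches (weight updates) needed to see every sample once."""
    return math.ceil(n_samples / batch_size)


def iter_batches(samples, batch_size):
    """Yield consecutive mini-batches; the last one may be smaller."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


n_train = 2197  # assumed: about 70% of the 3139 matchups
batch_sizes = [32, 64, 128, 256, 512, 1024]  # assumed doubling steps
steps = {size: updates_per_epoch(n_train, size) for size in batch_sizes}
```

Smaller batches give many noisy updates per epoch, and larger batches give few smooth ones, which is one reason accuracy can vary with batch size at a fixed number of epochs.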

Selected Model
In the previous steps, we selected the best conditions for the DNN model: four independent variables (Band 10, Band 11, NDVI, and NDWI) and a batch size of 256. Using these conditions, we constructed a DNN model with 70% of the total matchup data, which were selected randomly, and validated the model with the remaining 30% (Figure 5). We achieved an R value of 0.98 and an RMSE of 2.19 K; the mean absolute error (MAE) was 1.75 K, exceeding the accuracy obtained under the other conditions. Compared with the results of previous studies, our results showed better accuracy (Table 2). Ref. [19] found an MAE of 2.21 K, compared with 1.75 K in this study. Ref. [17] estimated Ta with an RMSE of 3.30 K, a lower accuracy than in this study. In [18], the RMSE and MAE were calculated for day and night conditions; our results showed higher accuracy than those of [19] in both RMSE and MAE. Ref. [20] estimated Ta with water conditions using multiple regression, showing an RMSE of 2.89 K, the highest accuracy among the earlier studies; however, that RMSE was about 0.7 K higher than ours.
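The four scores reported here (R, RMSE, MAE, and bias) can be computed from paired retrieved and reference values as sketched below. The sign convention for bias (retrieved minus reference) is an assumption, since the text does not define it, and the temperature values are hypothetical.

```python
import math


def validation_metrics(retrieved, reference):
    """Pearson R, RMSE, MAE, and bias (retrieved minus reference) for paired values."""
    n = len(retrieved)
    diffs = [r - o for r, o in zip(retrieved, reference)]
    bias = sum(diffs) / n
    mae = sum(abs(d) for d in diffs) / n
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    mean_r = sum(retrieved) / n
    mean_o = sum(reference) / n
    cov = sum((r - mean_r) * (o - mean_o) for r, o in zip(retrieved, reference))
    var_r = sum((r - mean_r) ** 2 for r in retrieved)
    var_o = sum((o - mean_o) ** 2 for o in reference)
    return {"R": cov / math.sqrt(var_r * var_o), "RMSE": rmse, "MAE": mae, "bias": bias}


# Hypothetical retrieved and station temperatures in K.
scores = validation_metrics([280.0, 285.0, 290.0, 295.0], [281.0, 284.0, 291.0, 294.0])
```

Reporting bias alongside RMSE separates a systematic offset from random scatter, which is why both appear in the later validation steps.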

Spatial Representativeness
Because Ta is readily altered by convection currents, the LST has a higher standard error if the Ta and LST have the same values. Conversely, surface heat transfer cannot be determined easily because there is no transmitter [52]. In a previous study, the deviation in Ta was about 0.6 K over a horizontal distance of 6 km [53]. Here, we tested 25 additional cases, with window sizes from 1 × 1 to 297 × 297, corresponding to areas of 30 m × 30 m and 9 km × 9 km, respectively (Figure 6). We examined the changes in RMSE and bias to assess spatial representativeness. As the window size increased, the RMSE decreased up to a window size of 231 × 231 (1.95 K). However, there was little difference in the RMSE beyond a window size of 132 × 132: the RMSE values at 231 × 231 and 132 × 132 differed by only 0.0102 K. The bias behaved similarly, decreasing as the window size increased, and was lowest at 132 × 132. We therefore considered the spatial representativeness to be optimal at the 132 × 132 window size, representing 4 km × 4 km, with an RMSE of 1.96 K and a bias of −0.00856 K.
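One plausible way to evaluate a window of a given size is to average the retrieved values over the pixels around the station, as in the sketch below. Averaging, the clipping at image edges, and the placement of even-sized windows are assumptions; the text does not state how values inside a window were aggregated. The helper `window_extent_km` only converts a window size to a ground distance for 30 m pixels.

```python
def window_mean(grid, row, col, size):
    """Mean over a size x size window around (row, col), clipped at the grid edges.

    For even sizes the window cannot be exactly centered; it is shifted by half a pixel.
    """
    r0 = row - size // 2
    c0 = col - size // 2
    r1, c1 = r0 + size, c0 + size
    r0, c0 = max(0, r0), max(0, c0)
    r1, c1 = min(len(grid), r1), min(len(grid[0]), c1)
    values = [grid[r][c] for r in range(r0, r1) for c in range(c0, c1)]
    return sum(values) / len(values)


def window_extent_km(size, pixel_m=30.0):
    """Side length in km of a window of `size` pixels."""
    return size * pixel_m / 1000.0


# Hypothetical 3 x 3 grid of retrieved values.
grid = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
center_mean = window_mean(grid, 1, 1, 3)
corner_mean = window_mean(grid, 0, 0, 3)
```

With 30 m pixels, `window_extent_km(132)` is about 3.96 km and `window_extent_km(297)` is about 8.91 km, consistent with the approximate 4 km and 9 km extents quoted in the text.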

Figure 6. Results of RMSE and bias according to window size for spatial representativeness verification.

Analysis of Individual Variables
It was necessary to identify changes in the estimated temperature of the calculation as the variables changed. We tested the Ta difference between the retrieved Ta and the reference Ta from ASOS data for each variable (Figure 7). When comparing the Ta difference, changes were identified for Band 10, Band 11, NDVI, and NDWI.
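One way to quantify how the Ta difference changes with an input variable is the slope of an ordinary least-squares line fitted to the pairs (variable value, Ta difference). The sketch below assumes this definition of slope, which the text does not spell out, and uses hypothetical values.

```python
def least_squares_slope(x, y):
    """Slope of the ordinary least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical NDVI values and Ta differences (retrieved minus reference, in K).
ndvi_values = [0.1, 0.3, 0.5, 0.7]
ta_difference = [0.0, 0.1, 0.2, 0.3]
slope = least_squares_slope(ndvi_values, ta_difference)
```

A slope must be read together with the range of the variable: an index confined to a narrow range can have a large slope and still change the Ta difference very little.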
In Figure 7, the Ta difference showed little change as the individual variables varied. The slope for Band 10 was −0.015, and the slope for Band 11 was −0.0031. Thus, changes in Bands 10 and 11 did not affect the accuracy of the temperature calculation (Figure 7a,b). A similar pattern occurred for the NDVI and NDWI (Figure 7c,d), whose slopes were 0.4948 and 0.2225, respectively. These slopes were larger than those of the IR bands; however, because the NDVI and NDWI vary over a small range, the resulting change in temperature was also very small.
Because air is a fluid, temperature is transferred to the surrounding area more smoothly than surface temperature is, so it is necessary to verify the accuracy of Ta by season. In this study, additional verification was performed for each season using RMSE and bias. The seasons were defined as March, April, and May (MAM) for spring; June, July, and August (JJA) for summer; September, October, and November (SON) for autumn; and December, January, and February (DJF) for winter. There were 268, 146, 285, and 225 verification data for MAM, JJA, SON, and DJF, respectively. The relatively small amount of data in summer was due to the presence of clouds during the rainy and typhoon seasons in Korea.
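The seasonal grouping defined above can be sketched as a month-to-season lookup followed by per-season RMSE and bias. The record layout (month, retrieved Ta, reference Ta), the bias sign convention, and the example values are hypothetical.

```python
import math

SEASON_BY_MONTH = {
    3: "MAM", 4: "MAM", 5: "MAM",
    6: "JJA", 7: "JJA", 8: "JJA",
    9: "SON", 10: "SON", 11: "SON",
    12: "DJF", 1: "DJF", 2: "DJF",
}


def seasonal_scores(records):
    """records: iterable of (month, retrieved_ta, reference_ta) tuples."""
    diffs = {}
    for month, retrieved, reference in records:
        diffs.setdefault(SEASON_BY_MONTH[month], []).append(retrieved - reference)
    scores = {}
    for season, values in diffs.items():
        bias = sum(values) / len(values)
        rmse = math.sqrt(sum(v * v for v in values) / len(values))
        scores[season] = {"n": len(values), "bias": bias, "RMSE": rmse}
    return scores


# Hypothetical records, for illustration only.
records = [(4, 286.0, 285.0), (5, 290.0, 291.0), (7, 300.5, 300.0), (1, 270.0, 272.0)]
scores = seasonal_scores(records)
```

Keeping the sample count `n` per season makes it easy to see when a score rests on few observations, as is the case for summer in this study.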

Time Series Analysis
The selected model showed a high accuracy, with an R value of 0.98, an RMSE of 2.19 K, and a bias of 0.0837 K. Next, we attempted to apply our DNN model to other data. The data chosen were from a different period, 2018-2019, obtained from six points (Figure 7). To provide a good test, the data were spread throughout the Republic of Korea. The data used corresponded to good and clear conditions, similar to the data used in the QA procedure described in Section 5.1.1.
The first was ASOS point data from Seoul, an urban area surrounded by buildings and roads. The model showed high accuracy, with an R value of 0.98, an RMSE of 1.75 K, a bias of −0.2338 K, and a temperature residual under 4 K (Figure 9a).
The second point was located in Incheon within 500 m of the sea, close to ports and industrial areas. We attempted to check the data in a low NDVI and high NDWI area. In this location, our model showed an R value of 0.98, an RMSE of 2.21 K, a bias of 0.2472 K, and a temperature difference of less than 5 K (Figure 9b).
The Busan point results are shown in Figure 9c; the characteristics of this area were mixed between those of the Seoul and Incheon locations. This location was positioned within 500 m of the sea and near a port; thus, it had a low NDVI and high NDWI similar to Incheon. However, Busan is more of an urban area like Seoul, and is surrounded by roads and buildings. Here, the model showed an R value of 0.97, an RMSE of 2.41 K, and a bias of −1.22 K. This point had lower accuracy than any other point.
The fourth result was for Gimhae, located in an agricultural and residential area (Figure 10a). This point showed greater variation in the NDVI. Similar to the other points, the R value was 0.99; however, the RMSE was 1.80 K, and the bias was −0.0768 K. Thus, our DNN model was capable of resolving changes in the NDVI.
The fifth point was located in Seosan, which is surrounded by forest (Figure 10b). This point represented the characteristics of a high NDVI. The R value was 0.98, the RMSE was 1.86 K, and the bias was −0.5173 K. Thus, our DNN model appears to be useful for forest areas.
The last point was in Yangpyeong, a residential area (Figure 10c). The R value was 0.99, the RMSE was 2.00 K, and the bias was 0.9749 K. Thus, Ta was retrieved with high accuracy at this point as well.
We applied the data from the six points to our DNN model and obtained R values of 0.97 or higher and RMSE values of 2.41 K or lower. These values were similar to the validation results obtained when constructing the DNN model, confirming the usability of the model.
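The three statistics reported for each point (R, RMSE, and bias) can be computed from paired retrieved and observed Ta series as in the minimal sketch below. The function name, the sample values, and the sign convention for bias (retrieved minus observed) are assumptions made for illustration; they are not taken from the study.

```python
import math


def validation_metrics(retrieved, observed):
    """Return (R, RMSE, bias) for paired retrieved and observed Ta values in K."""
    if len(retrieved) != len(observed) or len(retrieved) < 2:
        raise ValueError("need two equal-length series with at least 2 samples")
    n = len(retrieved)
    mean_r = sum(retrieved) / n
    mean_o = sum(observed) / n

    # Assumed sign convention: residual = retrieved minus observed.
    residuals = [r - o for r, o in zip(retrieved, observed)]
    bias = sum(residuals) / n
    rmse = math.sqrt(sum(d * d for d in residuals) / n)

    # Pearson correlation coefficient.
    cov = sum((r - mean_r) * (o - mean_o) for r, o in zip(retrieved, observed))
    var_r = sum((r - mean_r) ** 2 for r in retrieved)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_r == 0 or var_o == 0:
        raise ValueError("correlation is undefined for a constant series")
    r_value = cov / math.sqrt(var_r * var_o)
    return r_value, rmse, bias


# Hypothetical values for illustration only (not data from the study).
observed = [270.0, 280.0, 290.0, 300.0]
retrieved = [271.0, 279.0, 291.0, 301.0]
r_value, rmse, bias = validation_metrics(retrieved, observed)
print(f"R = {r_value:.3f}, RMSE = {rmse:.2f} K, bias = {bias:.2f} K")
```

Under this convention, a negative bias such as the one at Busan would mean that the retrieved Ta is lower than the observed Ta on average.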

Conclusions and Discussion
Ta is one of the key parameters representing Earth's energy cycle; thus, accurate estimation of Ta is important, and Ta is commonly used as input data for climate models. However, Ta is normally measured directly, and each measurement represents only a narrow area, so it is impractical to produce Ta over a large area from direct measurements alone. Satellite remote sensing, by contrast, covers a large area, but it retrieves LST rather than Ta because Ta cannot be observed directly. We therefore reasoned that if Ta could be retrieved from satellite data, the retrieved Ta would represent a large area and could be widely used in fields such as climate modeling. Previous studies retrieved Ta using LST data from satellites [17][18][19][20]. In those studies, the spatial resolution of the results was about 1 km, and multiple regression methods were used. We aimed to retrieve Ta at a higher spatial resolution with a different method; thus, we developed a DNN-based method using Landsat-8 data, which have a spatial resolution of 30 m. Because USGS does not offer an official, validated high-accuracy LST product for Landsat-8, we needed to estimate Ta using the correlation between other Landsat-8 data and the measured Ta.
To confirm the correlation of Ta with the candidate input variables of the DNN model, we tested four Landsat-8 variables: Band 10, Band 11, NDVI, and NDWI. Band 10 and Band 11 are essential in retrieving the LST; thus, these bands were expected to be highly correlated with Ta. Band 10 showed an R value of 0.96 and an RMSE of 2.73 K, and Band 11 showed an R value of 0.97 and an RMSE of 2.33 K. We confirmed that both bands are correlated with Ta, and that Band 11 is more strongly correlated with Ta than Band 10. We also checked the correlation of Ta with vegetation using the NDVI, and with moisture conditions, including those of the atmosphere, using the NDWI. The NDVI showed an R value of 0.97 and an RMSE of 2.44 K, and the NDWI showed an R value of 0.97 and an RMSE of 2.29 K, which means that both indices are correlated with Ta. Previous studies reported RMSE values of about 3 K [17][18][19][20]; because each of our four variables showed an RMSE under 3 K, we judged that they could be used as inputs to the DNN model for Ta retrieval in this study. These results imply that the variables chosen for our DNN model play an important role in Ta estimation and are conducive to increasing the accuracy of the retrieved Ta.
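For readers unfamiliar with the two indices, the sketch below shows how a normalized difference index is formed from two band reflectances. The NDVI form (near-infrared and red) is standard. This section does not state which NDWI formulation was used, so the near-infrared and shortwave-infrared form shown here is an assumption; a green and near-infrared form also exists. The function names and reflectance values are hypothetical.

```python
def normalized_difference(band_a, band_b):
    """Return (band_a - band_b) / (band_a + band_b), or NaN if the sum is zero."""
    total = band_a + band_b
    if total == 0:
        return float("nan")
    return (band_a - band_b) / total


def ndvi(nir, red):
    """NDVI from near-infrared and red reflectance (Landsat-8 OLI Bands 5 and 4)."""
    return normalized_difference(nir, red)


def ndwi_nir_swir(nir, swir):
    """NDWI from near-infrared and shortwave-infrared reflectance.

    Assumption: the study's NDWI definition is not given in this section;
    this is one common formulation (Landsat-8 OLI Bands 5 and 6).
    """
    return normalized_difference(nir, swir)


# Hypothetical surface reflectance values, for illustration only.
vegetation_ndvi = ndvi(nir=0.5, red=0.1)
vegetation_ndwi = ndwi_nir_swir(nir=0.3, swir=0.1)
print(f"NDVI = {vegetation_ndvi:.3f}, NDWI = {vegetation_ndwi:.3f}")
```

Both indices are bounded between −1 and 1 for non-negative reflectances, which makes them convenient to combine with brightness temperature inputs after scaling.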
After establishing the relevance of these variables, we applied them to a DNN model and tested the model to determine its optimal conditions. Our model showed an R value of 0.98, an RMSE of 2.19 K, and a bias of 0.0837 K. This result indicates a higher accuracy for our proposed model than those of the previous studies mentioned above [17][18][19][20]. Therefore, we consider that the retrieved Ta can represent an area as large as the Landsat-8 observation area.
After constructing the proposed model, we tested its spatial representativeness; the 132 × 132 window size showed the lowest bias, and we confirmed that the model represents an area of about 4 km × 4 km well. We then validated the model by season; the highest RMSE was about 2.8 K, in spring, and accuracy was maintained in every season. We further tested the applicability of the model to data from a different period at several points. To check the accuracy of the model, we applied data from six ASOS points from 2018 to 2019 to the DNN model, which was built with data from 2014 to 2017. Because each point had different land conditions, this test was useful for checking the accuracy of the model under the various conditions presented by the six locations. The applicability results were similar to those obtained for 2014 to 2017, with a minimum R of 0.97 and a maximum RMSE of 2.41 K.
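The correspondence between the 132 × 132 window and the area of about 4 km × 4 km follows from the 30 m pixel size of the Landsat-8 data. A minimal sketch of this arithmetic is given below; the function and constant names are hypothetical.

```python
PIXEL_SIZE_M = 30.0  # Landsat-8 pixel size used in this study, in meters


def window_extent_km(window_pixels, pixel_size_m=PIXEL_SIZE_M):
    """Return the side length in km of a square window of window_pixels pixels."""
    return window_pixels * pixel_size_m / 1000.0


side_km = window_extent_km(132)
print(f"132 x 132 pixels covers about {side_km:.2f} km x {side_km:.2f} km")
```

A 132-pixel side therefore spans 3.96 km, which is the basis for the figure of about 4 km.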
After all of these tests, we confirmed the accuracy, representativeness, and applicability of the retrieved Ta. We also confirmed that Ta can be retrieved from satellite data at a high spatial resolution. These results mean that the retrieved Ta can be used in climate models or in other studies that use high-spatial-resolution data. For these reasons, we expect that the retrieved Ta can be used to analyze urban problems that are very sensitive to temperature, such as urban heat islands.
In addition, this study was conducted only on the Korean Peninsula to facilitate Ta data acquisition; however, further research on various regions will be possible in the future, which will strengthen the model with regard to its application over a broader area. A further study could also be performed with data from another satellite with a high spatial resolution.