Sensitivity Analysis and Validation of Daytime and Nighttime Land Surface Temperature Retrievals from Landsat 8 Using Different Algorithms and Emissivity Models

Abstract: Land Surface Temperature (LST) is a key parameter indicating the relationship between the atmosphere and the land. This study aims to examine the efficiency of different LST algorithms, namely, Single Channel Algorithm (SCA), Mono Window Algorithm (MWA), and Radiative Transfer Equation (RTE), using both daytime and nighttime Landsat 8 data and in-situ measurements. Although many researchers conducted validation studies of daytime LST retrieved from Landsat 8 data, none of them considered nighttime LST retrieval and validation because of the lack of Land Surface Emissivity (LSE) data in the nighttime. Thus, in this paper, we propose using a daytime LSE image, whose acquisition is close to nighttime Thermal Infrared (TIR) data (the difference ranges from one day to four days), as an input in the algorithm for the nighttime LST retrieval. In addition to evaluating the three LST methods, we also investigated the effect of six Normalized Difference Vegetation Index (NDVI)-based LSE models in this study. Furthermore, sensitivity analyses were carried out for both in-situ measurements and LST methods for satellite data. Simultaneous ground-based LST measurements were collected from Atmospheric Radiation Measurement (ARM) and Surface Radiation Budget Network (SURFRAD) stations, located in different rural environments of the United States. Concerning the in-situ sensitivity results, the effect on LST of the uncertainty of the downwelling and upwelling radiance was almost identical in daytime and nighttime. In contrast, the effect of the broadband emissivity uncertainty in the nighttime was half that of the daytime. Concerning the satellite observations, the sensitivity of the LST methods to LSE showed that the variation of the LST error was smaller in the nighttime than in the daytime. The accuracy of the LST retrieval methods for daytime Landsat 8 data varied between 2.17 K Root Mean Square Error (RMSE) and 5.47 K RMSE considering all LST methods and LSE models.
MWA with two different LSE models presented the best results for the daytime. Concerning the nighttime accuracy of the LST retrieval, the RMSE value ranged from 0.94 K to 3.34 K. SCA showed the best results, but MWA and RTE also provided very high accuracy. Compared to daytime, all LST retrieval methods applied to nighttime data provided highly accurate results with the different LSE models and a lower bias with respect to in-situ measurements.

Atmospheric transmittance varied between 0.51 and 0.96 with a mean value of 0.83, while mean upwelling and downwelling radiances were 1.37 W·m⁻²·sr⁻¹·µm⁻¹ and 2.20 W·m⁻²·sr⁻¹·µm⁻¹, respectively. These mean values were utilized in the sensitivity analyses. A brightness temperature range from 270 K to 295 K was investigated since the brightness temperature computed from the nighttime Landsat scenes varied from 267.77 K to 297.22 K. An LSE value of 0.97 was assumed.


Introduction
Land Surface Temperature (LST), also named skin temperature, refers to the surface temperature of the Earth. The International Geosphere and Biosphere Program (IGBP) [1] accepted the LST as one of the high-priority parameters, and the Global Climate Observing System (GCOS) [2] identified it as an Essential Climate Variable (ECV). Considering space-borne, airborne, and ground-based remote sensors, LST represents the accumulative radiometric surface temperature of all materials of the surface cover covering the sensor's field of view in the observation direction [3]. Thus, LST estimation from Thermal Infrared (TIR) images is a complicated procedure since the Earth's surface is composed of dissimilar materials of varying geometry [4][5][6]. For example, the LST pixel of a densely vegetated area represents the surface temperature of vegetation; however, for a sparsely vegetated area, the surface temperature of vegetation and soil together comprises the LST of the area [5].
The history of satellite-derived LST goes back to the TIROS-II satellite, which was launched at the beginning of the 1960s [29,30]. Surface temperature estimation from radiance measurements at meteorological stations is a classical point-based technique; nevertheless, this technique does not represent the LST on a large scale. To overcome this drawback, spaceborne TIR remote sensing has been extensively examined for LST retrieval, and regional and global scale monitoring is the main advantage of this technology. However, surface parameters (emissivity and geometry), sensor parameters (spectral range and viewing angle), and atmospheric effects are the major factors that influence the accuracy of the LST retrieval from TIR data of satellites [5,29,31-33]. Thus, accurate estimation of Land Surface Emissivity (LSE) and atmospheric parameters is a crucial procedure to obtain LST from TIR data [34]. Concerning these parameters, various TIR-based multi-channel and single-channel LST retrieval methods have been proposed for different sensor types, namely, the Temperature-Independent Spectral Indices (TISI) method [35], Split Window Algorithm (SWA) [36-38], Mono Window Algorithm (MWA) [39], Single Channel Algorithm (SCA) [40,41], Radiative Transfer Equation (RTE) [42,43], and Temperature and Emissivity Separation (TES) method [44]. Among the LST retrieval methods above, only SWAs do not need atmospheric parameters such as water vapor profile and/or temperature. The LSE and LST errors arising from the other algorithms largely depend on the uncertainties of the input atmospheric profile [45].
There are numerous Earth observation sensors, namely, Geostationary Operational Environmental Satellite (GOES), Moderate Resolution Imaging Spectroradiometer (MODIS), Advanced Along-Track Scanning Radiometer (AATSR), The Spinning Enhanced Visible and Infrared Imager (SEVIRI), The Advanced Very High Resolution Radiometer (AVHRR), The Visible Infrared Imaging Radiometer Suite (VIIRS), and Sentinel-3, providing operational daytime and nighttime LST products with low spatial resolution (from 750 m to 4 km). However, TIR data of Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) and Landsat satellite series have higher spatial resolution but lower temporal resolution than the sensors reported above. Regarding these limitations, LST retrieval having both high temporal and spatial resolution is a challenge for thermal remote sensing studies. However, LST images obtained from Landsat and ASTER TIR data are unique sources to investigate the thermal environment of cities and their surroundings due to the higher spatial resolution in TIR bands. Moreover, Landsat-derived LST is one of the most commonly preferred data for various applications stated above.
The demand for satellite-based LST products has been increasing rapidly. Thus, the quality of the LST data used in the studies should be examined by a validation procedure for accurate and reliable analyses. Validation provides information about the quantitative uncertainty, enabling the proper use and application of the product. Consequently, no algorithm or product would be broadly accepted without thorough calibration and validation [46]. Overall, cross-validation, the Temperature-based method (T-based) and the Radiance-based method (R-based) are the three main techniques considered to evaluate space-based LST [31,34]. Many researchers have considered one or two of these methods for the validation of satellite-based LST derived from Landsat missions [34,47-55], Sentinel-3A [56], GOES [57], SEVIRI [58,59], MODIS [60-62], AATSR [58,62,63], VIIRS [64], ASTER [65,66] and AVHRR [38]. In this work, we utilized the T-based technique for LST validation, and further details about this method are presented in the Methodology Section.
In this study, Landsat 8 data, both daytime and nighttime, were considered for LST retrieval from RTE, SCA, and MWA methods. SWA was not examined since the USGS does not recommend using Band 11 of Landsat 8 for LST retrieval due to its large calibration uncertainty. Furthermore, we already obtained better results with MWA than with the SWA developed by Mao et al. [36] with coefficients by Yu et al. [51] in our previous research [34]. Considering the literature, in general, researchers have used daytime Landsat data to retrieve LST due to the lack of LSE images at night. To the best of our knowledge, there is no study published so far that considered nighttime TIR data of Landsat 8 for both retrieval and validation of nighttime LST. Even though the availability of the nighttime Landsat TIR data is limited in time and many researchers are not even aware that Landsat missions acquire nighttime TIR data, it is probable that future Landsat missions may provide much more nighttime TIR data for the sustainability and strength of the scientific studies. As discussed in the previous paper of the authors [34], Normalized Difference Vegetation Index (NDVI)-based LSE retrieval methods are practical and easy to apply to the Landsat data. In this paper, we propose using daytime NDVI-based LSE, whose acquisition is close to nighttime data (the difference ranges from 1 day to 4 days), as an input in the corresponding methods for the nighttime LST retrieval. Besides, the effect of six different NDVI-based LSE models on LST retrieval methods was evaluated for both daytime and nighttime LST analyses. As stated in the day-night algorithm [67], the LSE does not vary dramatically over several days if snow and/or rain does not occur during that short period. Thus, we assumed that the daytime LSE remains valid at night within a few days, considering the weather conditions of the corresponding time interval.
The objectives of this study are to (1) evaluate the efficiency of RTE, MWA, and SCA methods for both daytime and nighttime Landsat 8 data and in-situ measurements, (2) reveal the impact of NDVI-based LSE models on LST retrieval methods for both daytime and nighttime data, (3) encourage the researchers by showing the convenience of the proposed nighttime LST retrieval from Landsat 8 data for the common usage, and (4) provide sensitivity analyses of in-situ measurements and LST retrieval methods for both daytime and nighttime data. Concerning the ground-based LST measurements, upwelling and downwelling thermal radiation measurements were obtained from Atmospheric Radiation Measurement (ARM) and Surface Radiation Budget Network (SURFRAD) stations, established over rural areas, simultaneously with TIR data acquisitions. To carry out the image-processing tasks, we used an automated LST retrieval toolbox, which was provided by the authors for the use of researchers in the previous study [34].

In-Situ LST Measurements and Validation Sites
Surface longwave radiation measurements are important sources for the estimation of in-situ LST and emissivity [65,68]. There are some programs, namely, SURFRAD [69], FLUXNET [70], ARM [71], and Baseline Surface Radiation Network (BSRN) [72] that provide long-term and high-quality surface longwave radiation measurements open to the public. In this study, four stations from SURFRAD and five stations from ARM, nine stations in total (Figure 1) over rural areas, were utilized to calculate daytime and nighttime in-situ LST simultaneous with TIR data acquisitions. The SURFRAD network was established by the National Oceanic and Atmospheric Administration (NOAA) in 1993 to support climate-related research over the United States (US) by providing long-term, continuous, and accurate in-situ surface radiation budget [69]. In 1995, the system started operating with four stations, and now, seven SURFRAD stations have been serving in different climatological regions of the US. The SURFRAD data have been utilized in different studies involving assessment of satellite-based retrievals of surface radiation parameters, climate models, hydrology, and validation of radiation transfer codes and surface physics packages of weather [69]. To calculate in-situ LST, quality-controlled measurements of broadband hemispherical upwelling and downwelling longwave radiation are provided by the SURFRAD stations every 3 min (before 2009) or every minute (after 2009). Many studies have been carried out using SURFRAD measurements to validate LST retrievals from satellites [34,47,54,73-75].
The ARM Program was initially founded in 1989 by the US Department of Energy to examine cloud formation processes. Then, the ARM Climate Research Facility was established in 2003, and this program added further sites and instruments to the available ones as a scientific user facility. All data, providing long-term continuous atmospheric measurements, have been freely available since 2003 (https://www.arm.gov/) [76]. Eastern North Atlantic (ENA), North Slope of Alaska (NSA), and Southern Great Plains (SGP) are three basic ARM sites. In this study, five SGP sites were used for in-situ LST retrieval. As in SURFRAD stations, ARM SGP stations provide quality-controlled measurements of upwelling and downwelling longwave radiation for in-situ LST calculation, and many types of research were carried out using these stations [61,77-79]. Table 1 presents detailed information about both ARM SGP sites and SURFRAD sites considered in this study.
In the validation sites, two pyrgeometers (Eppley Precision Infrared Radiometer) mounted at 10-m height measure the downwelling and upwelling longwave radiation in the spectral range from 4.0 to 50.0 µm. The instruments are exchanged annually with newly calibrated instruments at each station [69] and world-recognized organizations perform these calibrations [65]. The Eppley pyrgeometer has about 4.2 W·m⁻² measurement accuracy, and the instrument's precision is around 2 W·m⁻² for daytime measurements and less than 1 W·m⁻² for nighttime measurements [80]. Furthermore, Guillevic et al. [75] reported that, considering the instrumental error, less than 1 K uncertainty is observed in the retrieved LST. In this study, we also conducted sensitivity/uncertainty analyses for both daytime and nighttime in-situ measurements in Section 4.1.
The spatial representativeness of the pyrgeometer is about 70 m × 70 m at the surface [34,65], which is appropriate for the Landsat TIR pixel size (100 m native resampled at 30 m by the US Geological Survey) over homogeneous surfaces. Thus, we selected the validation sites whose footprint on Landsat 8 TIR pixel has homogeneous surface cover. On the other hand, many studies have already considered these ARM SGP and SURFRAD stations to validate low-resolution LST products of MODIS, SEVIRI, GOES, VIIRS, and AATSR [46,58,65,81,82]. Therefore, the use of these stations in the validation of Landsat-derived LST products is highly acceptable.

Satellite Data
The Landsat mission has been providing moderate-resolution earth observation data from space regularly for almost 50 years. Landsat 8 was launched on 11 February 2013, and it is the most recent operational satellite of the Landsat series. Landsat 4 was the first mission providing one thermal band, and the first TIR data of Landsat 4 date back to 1982, which makes it possible to study long-term LST variations together with all Landsat missions at both regional and local scales. The Landsat 8 satellite carries two sensors, namely, the Operational Land Imager (OLI) and the Thermal Infrared Sensor (TIRS). TIRS has two thermal bands (Band 10 and Band 11), while OLI has nine reflective bands with 30-m spatial resolution. The native spatial resolution of the TIR bands is 100 m; however, USGS publishes them at 30 m by resampling.
In this study, 21 pairs of nighttime and daytime Landsat-8 data (Collection 1) from 2013 to 2019 were utilized for the retrieval of daytime and nighttime LST images. Landsat 8 data were freely obtained through the website of the USGS (https://earthexplorer.usgs.gov/). Band 10 of the TIRS sensor, and Band 4 (Red (R)) and Band 5 (Near Infrared (NIR)) of the OLI sensor, for the estimation of NDVI-based LSE, were used in LST retrieval methods. The quality of the used data was checked by the Pixel Quality Assessment (QA) band that provides information for the exclusion of observations affected by sensor factors, clouds, and cloud shadow [83]. The list of the daytime and nighttime Landsat 8 images with corresponding validation site names are reported in Appendix A.

Satellite LST Retrieval Methods
In this study, the following three commonly used methods for LST retrieval are examined: Radiative Transfer Equation (RTE) method, Single Channel Algorithm (SCA) [40], and Mono Window Algorithm (MWA) [39]. The input atmospheric parameters in the methods, such as downwelling radiance ($L_{\lambda}^{\downarrow}$), upwelling radiance ($L_{\lambda}^{\uparrow}$), and atmospheric transmittance ($\tau$), were calculated using the Atmospheric Correction Parameter Calculator (ACPC) developed by the National Aeronautics and Space Administration (NASA) of the US. ACPC uses the atmospheric profiles analyzed by the National Centers for Environmental Prediction (NCEP) as inputs to the radiative transfer codes for a given site and date to calculate the aforementioned atmospheric parameters [84,85].

Brightness Temperature (Tb) Calculation
The brightness temperature of a target refers to the temperature of a blackbody emitting the same quantity of radiation at a specific wavelength [86], and it is calculated by inverting the Planck function. To obtain the brightness temperature image from TIR data, the first step is converting the Digital Number (DN) values to spectral radiance. This radiance conversion for Landsat 8 TIRS can be applied using Equation (1) [87]:

$$L_{\lambda}^{sen} = M_L \cdot Q_{CAL} + A_L \quad (1)$$

where $L_{\lambda}^{sen}$ refers to the TOA spectral radiance in Watts/(m²·srad·µm), $Q_{CAL}$ is the calibrated and quantized standard product pixel values (DNs), $A_L$ is the additive rescaling factor of the corresponding band, and $M_L$ is the multiplicative rescaling factor of the corresponding band. The metadata file of the relevant Landsat 8 data contains the values of these parameters. The brightness temperature for Landsat 8 data can be calculated after radiance conversion using Equation (2):

$$T_b = \frac{K_2}{\ln\left(\dfrac{K_1}{L_{\lambda}^{sen}} + 1\right)} \quad (2)$$

where $T_b$ is the effective at-satellite brightness temperature in Kelvin, and $K_1$ in Watts/(m²·srad·µm) and $K_2$ in Kelvin refer to the calibration constants. The $K_1$ and $K_2$ values for the Landsat 8 Band 10 are 774.89 Watts/(m²·srad·µm) and 1321.08 K, respectively.
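The two conversion steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' toolbox: the function names are hypothetical, and the rescaling factors `M_L` and `A_L` and the DN value are placeholder numbers chosen for the example; real values must be read from the metadata file of the scene.

```python
import math

# Calibration constants for Landsat 8 Band 10 as quoted in the text.
K1 = 774.89   # W/(m^2 sr um)
K2 = 1321.08  # K


def dn_to_radiance(dn, m_l, a_l):
    """Equation (1): TOA spectral radiance from a quantized pixel value."""
    return m_l * dn + a_l


def brightness_temperature(radiance, k1=K1, k2=K2):
    """Equation (2): inverse Planck function in its two-constant form."""
    return k2 / math.log(k1 / radiance + 1.0)


# Placeholder (assumed) rescaling factors and DN; not taken from the paper.
M_L = 3.342e-4
A_L = 0.1
radiance = dn_to_radiance(27000, M_L, A_L)
tb = brightness_temperature(radiance)
print(f"L_sen = {radiance:.4f} W/(m^2 sr um), Tb = {tb:.2f} K")
```

Keeping the two steps as separate functions makes it easy to reuse the brightness temperature step with radiances that have already been atmospherically corrected.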

Radiative Transfer Equation Method
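The inverse solution of the radiative transfer equation (RTE) is a direct method for LST retrieval using a single TIR band. The radiative transfer equation can be written as:

$$L_{\lambda}^{sen} = \left[\varepsilon B_{\lambda}(T_s) + (1 - \varepsilon) L_{\lambda}^{\downarrow}\right]\tau + L_{\lambda}^{\uparrow} \quad (3)$$

where $L_{\lambda}^{sen}$ (W·m⁻²·sr⁻¹·µm⁻¹) represents the at-sensor spectral radiance of the corresponding TIR band, $\varepsilon$ refers to the LSE, $B_{\lambda}$ in W·m⁻²·sr⁻¹·µm⁻¹ is the blackbody radiance, $T_s$ is the LST, $L_{\lambda}^{\downarrow}$ and $L_{\lambda}^{\uparrow}$ represent the downwelling and upwelling radiance, respectively, and $\tau$ is the atmospheric transmittance. $B_{\lambda}$ at a temperature of $T_s$ is calculated by the inversion of Equation (3):

$$B_{\lambda}(T_s) = \frac{L_{\lambda}^{sen} - L_{\lambda}^{\uparrow} - \tau (1 - \varepsilon) L_{\lambda}^{\downarrow}}{\tau \varepsilon} \quad (4)$$

and, eventually, $T_s$ (LST) can be obtained from the inversion of Planck's law as in Equation (5):

$$T_s = \frac{K_2}{\ln\left(\dfrac{K_1}{B_{\lambda}(T_s)} + 1\right)} \quad (5)$$

where $K_1$ and $K_2$ refer to the calibration constants described in the previous section.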
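As an illustration of the inversion, the sketch below first simulates an at-sensor radiance for a known surface temperature and then retrieves that temperature again. It assumes the standard single-band form of the radiative transfer equation and the two-constant Planck approximation; the function names and the test temperature of 300 K are hypothetical, while the transmittance and radiance values are the mean values quoted in the abstract section.

```python
import math

K1, K2 = 774.89, 1321.08  # Landsat 8 Band 10 constants quoted in the text


def planck_radiance(temperature):
    """Band radiance of a blackbody (two-constant form used for Landsat)."""
    return K1 / (math.exp(K2 / temperature) - 1.0)


def rte_lst(l_sen, emissivity, tau, l_up, l_down):
    """Invert the radiative transfer equation for the surface temperature."""
    # Blackbody radiance at the surface, Equation (4).
    b_ts = (l_sen - l_up - tau * (1.0 - emissivity) * l_down) / (tau * emissivity)
    # Inverse Planck function, Equation (5).
    return K2 / math.log(K1 / b_ts + 1.0)


# Forward simulation with an assumed surface temperature (illustrative only).
true_ts = 300.0
eps, tau, l_up, l_down = 0.97, 0.83, 1.37, 2.20
l_sen = tau * (eps * planck_radiance(true_ts) + (1.0 - eps) * l_down) + l_up

retrieved_ts = rte_lst(l_sen, eps, tau, l_up, l_down)
print(f"simulated L_sen = {l_sen:.4f}, retrieved LST = {retrieved_ts:.3f} K")
```

A round trip of this kind is a convenient self-check before applying the inversion to real scenes, because any sign or grouping error in Equation (4) shows up immediately as a mismatch.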

Mono Window Algorithm
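Qin et al. [39] developed the Mono Window Algorithm (MWA) for the Landsat TM data. Three essential variables, namely, LSE, effective mean atmospheric temperature, and atmospheric transmittance are required for LST retrieval using the MWA method. MWA-based LST can be retrieved by Equation (6):

$$T_s = \frac{a(1 - C - D) + \left[b(1 - C - D) + C + D\right] T_b - D\,T_a}{C} \quad (6)$$

where $T_a$ is the effective mean atmospheric temperature in Kelvin, $a$ (−67.355351) and $b$ (0.458606) are constants of the algorithm, and $C$ and $D$ are the parameters of the algorithm calculated as $C = \varepsilon \times \tau$ and $D = (1 - \tau)\left[1 + (1 - \varepsilon)\tau\right]$. Table 2 provides empirical equations to estimate $T_a$ from the air temperature ($T_o$), since it is an essential parameter of MWA [39]. In this study, $T_a$ values were computed for the mid-latitude summer region and $T_o$ was obtained from the corresponding validation site.

Single Channel Algorithm
Jiménez-Muñoz et al. [40] proposed a revised version of SCA for LST retrieval using Landsat TIR data. Concerning the SCA, $T_s$ is obtained from Equation (7):

$$T_s = \gamma\left[\frac{1}{\varepsilon}\left(\psi_1 L_{\lambda}^{sen} + \psi_2\right) + \psi_3\right] + \delta \quad (7)$$

where $\psi_1$, $\psi_2$, and $\psi_3$ refer to atmospheric functions defined as:

$$\psi_1 = \frac{1}{\tau}, \qquad \psi_2 = -L_{\lambda}^{\downarrow} - \frac{L_{\lambda}^{\uparrow}}{\tau}, \qquad \psi_3 = L_{\lambda}^{\downarrow} \quad (8)$$

Concerning the SCA method in this study, $L_{\lambda}^{\uparrow}$, $L_{\lambda}^{\downarrow}$, and $\tau$ obtained from NASA's ACPC were used for the computation of $\psi_1$, $\psi_2$, and $\psi_3$. On the other hand, the two parameters, $\gamma$ and $\delta$, are computed by:

$$\gamma \approx \frac{T_b^2}{b_{\gamma} L_{\lambda}^{sen}} \quad (9)$$

$$\delta \approx T_b - \frac{T_b^2}{b_{\gamma}} \quad (10)$$

where $b_{\gamma} = c_2/\lambda_i$ and $c_2$ = 14,387.7 µm·K, and $b_{\gamma}$ is equal to 1320 K for Landsat 8 Band 10. $\lambda_i$ is the ith band's effective wavelength given by:

$$\lambda_i = \frac{\int_{\lambda_{1,i}}^{\lambda_{2,i}} \lambda f_i(\lambda)\, d\lambda}{\int_{\lambda_{1,i}}^{\lambda_{2,i}} f_i(\lambda)\, d\lambda} \quad (11)$$

where $f_i(\lambda)$ is the ith band's spectral response function, and $\lambda_{1,i}$ and $\lambda_{2,i}$ refer to the lower and upper boundary of $f_i(\lambda)$, respectively.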
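The two single-band algorithms can be compared on a synthetic pixel with the short Python sketch below. It is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, the expression used for the MWA parameter D is the one commonly given for the mono-window algorithm, the SCA atmospheric functions are written directly in terms of transmittance and upwelling and downwelling radiance, and the effective mean atmospheric temperature of 289 K is an arbitrary example value.

```python
import math

K1, K2 = 774.89, 1321.08            # Band 10 constants quoted in the text
MWA_A, MWA_B = -67.355351, 0.458606  # MWA constants quoted in the text
B_GAMMA = 1320.0                     # K, value quoted in the text for Band 10


def mono_window_lst(tb, emissivity, tau, t_a):
    """Mono Window Algorithm; D follows the commonly cited definition."""
    c = emissivity * tau
    d = (1.0 - tau) * (1.0 + (1.0 - emissivity) * tau)
    k = 1.0 - c - d
    return (MWA_A * k + (MWA_B * k + c + d) * tb - d * t_a) / c


def single_channel_lst(l_sen, tb, emissivity, tau, l_up, l_down):
    """Single Channel Algorithm with atmospheric functions from tau, L_up, L_down."""
    psi1 = 1.0 / tau
    psi2 = -l_down - l_up / tau
    psi3 = l_down
    gamma = tb ** 2 / (B_GAMMA * l_sen)
    delta = tb - tb ** 2 / B_GAMMA
    return gamma * ((psi1 * l_sen + psi2) / emissivity + psi3) + delta


# Synthetic pixel: simulate the at-sensor radiance of a 300 K surface.
eps, tau, l_up, l_down = 0.97, 0.83, 1.37, 2.20
b_ts = K1 / (math.exp(K2 / 300.0) - 1.0)
l_sen = tau * (eps * b_ts + (1.0 - eps) * l_down) + l_up
tb = K2 / math.log(K1 / l_sen + 1.0)

sca_ts = single_channel_lst(l_sen, tb, eps, tau, l_up, l_down)
mwa_ts = mono_window_lst(tb, eps, tau, t_a=289.0)  # t_a is a hypothetical value
print(f"Tb = {tb:.2f} K, SCA LST = {sca_ts:.2f} K, MWA LST = {mwa_ts:.2f} K")
```

A useful sanity check of the MWA expression is the limiting case of a perfectly transparent atmosphere over a blackbody (transmittance and emissivity both equal to 1), where C = 1 and D = 0 and the retrieved LST reduces to the brightness temperature.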

NDVI-Based Land Surface Emissivity (LSE) Models
Emissivity of a surface represents the ability of the surface to transform heat energy, relative to a black body, into radiant energy [88]. As presented in the above sections, LSE ($\varepsilon$) is a critical element for accurate TIR-based LST retrieval. Multi-channel Temperature/Emissivity Separation (TES) methods, Physically Based Methods (PBMs), and Semi-Empirical Methods (SEMs) are the three main types of space-based LSE estimation [31]. The NDVI-Based Emissivity Method (NBEM) [89,90] and Classification Based Emissivity Method (CBEM) [91,92] constitute the SEMs that are convenient for the Landsat-derived LSE. CBEM is not feasible because it requires a priori information about the test site and the in-situ emissivity of each class [93]. NDVI-based LSE models are practical and frequently used methods due to their easy application and satisfactory results [88,94,95]. Li et al. [31] presented a comprehensive review revealing limitations, advantages, and disadvantages of LSE models for satellite-derived LST. Moreover, Sekertekin and Bonafoni [34] provided an updated state-of-the-art table from Li et al. [31], presenting the used satellite missions with the corresponding LSE models. In this study, we examined the influence of six NDVI-based LSE models on the performance of three LST algorithms for both daytime and nighttime. To calculate NDVI from Landsat 8 data, firstly, DN values are converted to the TOA reflectance using Equation (12) [87]. After applying the reflectance ($\rho_{\lambda}$) conversion to the R and NIR bands, NDVI is obtained from Equation (13). Specifically:

$$\rho_{\lambda} = \frac{M_p \cdot Q_{CAL} + A_p}{\sin(\theta_{SE})} \quad (12)$$

where $Q_{CAL}$ is the calibrated and quantized standard product pixel values (DNs), $A_p$ is the additive rescaling factor of the corresponding band, $M_p$ is the multiplicative rescaling factor of the corresponding band, and $\theta_{SE}$ represents the local sun elevation angle. The values of these parameters are obtained from the metadata file of the relevant Landsat 8 data.

$$NDVI = \frac{\rho_{NIR} - \rho_{R}}{\rho_{NIR} + \rho_{R}} \quad (13)$$

where $\rho_{NIR}$ refers to the reflectance image of the NIR band and $\rho_{R}$ is the reflectance image of the R band.
In addition to NDVI, the Fractional Vegetation Cover (FVC or $P_v$), i.e., the proportion of vegetation, is another important factor for LSE estimation, and it is calculated from Equation (14) [96] as:

$$P_v = \left(\frac{NDVI - NDVI_{min}}{NDVI_{max} - NDVI_{min}}\right)^2 \quad (14)$$

where $NDVI_{min}$ = 0.2 and $NDVI_{max}$ = 0.5 in a global context [93]. Table 3 presents the expressions of the six NDVI-based LSE models used in this work (hereafter referred to as LSE1, LSE2, ..., LSE6). More details about these models can be found in the previous paper of the authors [34].
[Table 3: expressions of the six NDVI-based LSE models (LSE1 to LSE6), including the models of Valor and Caselles [90] and Li and Jiang [98].]
Table notes: the cavity-effect term is based on the geometry of the surface; $\varepsilon_s$, $\varepsilon_v$ and $F$ refer to soil emissivity, vegetation emissivity and geometrical shape factor (0.55), respectively. $\rho_R$: the reflectance image of the R band; $\rho_j$: the apparent reflectance in the OLI band j; $a_{1i}$ to $a_{7i}$: the coefficients obtained from [98].
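The preprocessing chain that feeds every NDVI-based LSE model (DN to TOA reflectance, NDVI, then fractional vegetation cover) can be sketched as follows. This is a per-pixel illustration with hypothetical function names; the rescaling factors, sun elevation, and DN values are placeholders that would normally come from the scene metadata, and clamping the scaled NDVI to the range 0 to 1 outside the NDVI thresholds is an assumption of this sketch, not something stated in the text.

```python
import math

NDVI_MIN, NDVI_MAX = 0.2, 0.5  # global-context thresholds quoted in the text


def toa_reflectance(dn, m_p, a_p, sun_elevation_deg):
    """TOA reflectance corrected for the local sun elevation angle."""
    return (m_p * dn + a_p) / math.sin(math.radians(sun_elevation_deg))


def ndvi(rho_nir, rho_red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    return (rho_nir - rho_red) / (rho_nir + rho_red)


def fractional_vegetation_cover(ndvi_value):
    """Squared, scaled NDVI; values outside the thresholds are clamped (assumed)."""
    scaled = (ndvi_value - NDVI_MIN) / (NDVI_MAX - NDVI_MIN)
    scaled = min(max(scaled, 0.0), 1.0)
    return scaled ** 2


# Placeholder metadata values and DNs for one pixel (illustrative only).
M_P, A_P, SUN_ELEVATION = 2.0e-5, -0.1, 60.0
rho_red = toa_reflectance(8000, M_P, A_P, SUN_ELEVATION)
rho_nir = toa_reflectance(12000, M_P, A_P, SUN_ELEVATION)
pixel_ndvi = ndvi(rho_nir, rho_red)
pixel_fvc = fractional_vegetation_cover(pixel_ndvi)
print(f"NDVI = {pixel_ndvi:.3f}, FVC = {pixel_fvc:.3f}")
```

For the nighttime retrieval proposed in this paper, the same chain would be run on the daytime reflective bands acquired one to four days from the nighttime TIR scene.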

In-Situ LST Estimation
Station-based (in-situ or ground-based) LST measurements were obtained from four SURFRAD stations and five ARM SGP stations. As stated in Section 2.1, these stations do not measure LST directly; the upwelling and downwelling components of longwave radiation are used to calculate LST according to the Stefan-Boltzmann law:

$$LST = \left[\frac{F_{\lambda}^{\uparrow} - (1 - \varepsilon_b) F_{\lambda}^{\downarrow}}{\varepsilon_b\, \sigma}\right]^{1/4} \quad (15)$$

where $F_{\lambda}^{\downarrow}$ and $F_{\lambda}^{\uparrow}$ in W/m² are the downwelling and upwelling thermal infrared irradiances, respectively, obtained simultaneously with satellite passages, and $\sigma$ is the Stefan-Boltzmann constant (5.670367 × 10⁻⁸ W·m⁻²·K⁻⁴). $\varepsilon_b$ is the broadband longwave surface emissivity, which is not measured by the station instruments; thus, the authors of [65,68] proposed the computation of the broadband emissivity by regression from narrowband emissivities of MODIS data, and many studies used these regression equations to obtain $\varepsilon_b$ [52,73,99]. The experimental results in [65,68] revealed that the longwave broadband emissivity can be set to a fixed value of 0.97, which was also adopted in the studies of [74,100]. In this study, we also assumed a broadband emissivity of 0.97. This assumption only affects the accuracy of in-situ LST, not the satellite-based LST accuracy. Heidinger et al. [74] reported that a 0.01 error in broadband emissivity led to 0.25 K LST error in SURFRAD sites. Furthermore, Wang and Liang [65] showed that the LST error at SURFRAD sites ranged from 0.1 K to 0.4 K due to the ±0.01 error in the broadband emissivity. This error is not negligible; however, it is not an overwhelming uncertainty source compared to the magnitude of the other uncertainties in LST retrieval [74]. In this study, we also carried out an uncertainty analysis of the effect of broadband longwave surface emissivity and longwave radiation (the downwelling and upwelling components) on ground-based LST measurements, presented in the next section.
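The station-based calculation is compact enough to show in full. The sketch below is a minimal illustration with a hypothetical function name; it uses the fixed broadband emissivity of 0.97 adopted in this study and, as example inputs, the mean daytime upwelling and downwelling values reported in the Results.

```python
SIGMA = 5.670367e-8  # Stefan-Boltzmann constant, W m^-2 K^-4 (value from the text)


def in_situ_lst(f_up, f_down, emissivity_b=0.97):
    """In-situ LST (K) from upwelling and downwelling longwave irradiance (W/m^2)."""
    # Remove the reflected part of the downwelling irradiance, then invert
    # the Stefan-Boltzmann law for the emitted component.
    emitted = f_up - (1.0 - emissivity_b) * f_down
    return (emitted / (emissivity_b * SIGMA)) ** 0.25


# Mean daytime irradiances reported in the Results section of this paper.
daytime_lst = in_situ_lst(482.18, 331.15)
print(f"in-situ daytime LST = {daytime_lst:.2f} K")
```

Because the downwelling term is multiplied by (1 − ε_b) = 0.03, errors in the downwelling irradiance have a far smaller effect on the result than errors in the upwelling irradiance.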

Sensitivity Analysis of In-Situ LST Measurements and LST Retrieval Methods
Sensitivity analysis is the study of how the error of a model output (numerical, statistical, or otherwise) can be divided and allocated to different uncertainty sources in the model inputs [101]. It is difficult to determine the inputs of an algorithm exactly, since these inputs unavoidably have initial errors affecting the accuracy of the LST retrieval methods [34,49]. To investigate the effect of input parameters' errors on LST retrievals from both satellites and stations, the following equation is utilized:

$$\delta T = \left| T_s(x + \delta x) - T_s(x) \right| \quad (16)$$

where $\delta T$ is the error on the LST; $x$ represents one of the input parameters and $\delta x$ is the potential error of this parameter; $T_s(x + \delta x)$ and $T_s(x)$ refer to the LST calculated for "$x + \delta x$" and "$x$", respectively. Some researchers reported the effect of the uncertainty of the input parameters on LST retrieval algorithms [49,102,103]. On the other hand, concerning the sensitivity analysis of in-situ LST measurements, the authors of [65,74] investigated the sensitivity of SURFRAD LST to broadband emissivity. In the previous paper of the authors [34], we already presented a detailed sensitivity analysis for daytime LST retrieval considering MWA, SCA, and RTE. In this study, we mainly focused on the effect of LSE on LST retrieval methods for both daytime and nighttime LST retrievals, since we proposed using the daytime LSE images for nighttime LST retrieval. Furthermore, we also conducted a comprehensive sensitivity analysis for the in-situ LST measurements that is presented in Section 4.1.
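A sketch of this perturbation approach, applied to the LSE input of a single-band radiative transfer inversion, is given below. It is illustrative only: the function names are hypothetical, the inversion assumes the standard single-band radiative transfer form with the two-constant Planck approximation, the absolute difference is used as the error measure, and the atmospheric values are the mean transmittance and radiances quoted earlier in the text together with the assumed LSE of 0.97.

```python
import math

K1, K2 = 774.89, 1321.08               # Band 10 constants quoted in the text
TAU, L_UP, L_DOWN = 0.83, 1.37, 2.20   # mean atmospheric values quoted in the text


def rte_lst(tb, emissivity):
    """LST from brightness temperature via single-band RTE inversion (assumed form)."""
    l_sen = K1 / (math.exp(K2 / tb) - 1.0)
    b_ts = (l_sen - L_UP - TAU * (1.0 - emissivity) * L_DOWN) / (TAU * emissivity)
    return K2 / math.log(K1 / b_ts + 1.0)


def lst_error(retrieve, x, dx):
    """LST change caused by perturbing one input x by dx, others held fixed."""
    return abs(retrieve(x + dx) - retrieve(x))


# Effect of a 0.01 LSE error over the brightness temperature range in the text.
for tb in range(270, 300, 5):
    err = lst_error(lambda eps: rte_lst(float(tb), eps), 0.97, 0.01)
    print(f"Tb = {tb} K -> LST error = {err:.3f} K")

error_at_285 = lst_error(lambda eps: rte_lst(285.0, eps), 0.97, 0.01)
```

Passing the retrieval as a one-argument function keeps the helper independent of the algorithm, so the same call pattern could be used with MWA or SCA, or with any other input parameter.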

Temperature-Based (T-Based) Validation Method and Performance Metrics
As stated in the introduction, the Radiance-based method (R-based), Temperature-based method (T-based), and cross-validation are the main techniques used to evaluate space-based LST [31,34].
The T-based technique, examined in this research, is a direct way of comparing the satellite-derived LST with in-situ LST simultaneous with satellite pass, and many researchers used this way to validate satellite-derived LSTs [48,52,62,104,105]. The major benefit of the T-based method is that it makes it possible to evaluate satellite sensor's radiometric quality and the efficiency of the LST algorithms based on emissivity and atmospheric parameters. On the other hand, the capability of the T-based technique depends mostly on the accuracy of the in-situ LST measurements and how well they represent the LST at the satellite pixel scale (land cover homogeneity of the study area) [31]. In this study, we considered both issues as we carried out the sensitivity analysis of the SURFRAD LST measurements and selected the validation sites whose footprint on Landsat 8 TIR pixel has homogeneous surface cover.
Satellite-derived LST and station-based LST were compared using the following performance metrics: Root Mean Square Error (RMSE), Standard Deviation (STD) of Error, and average Bias. The formulas of these metrics are given by:

$$RMSE = \sqrt{\frac{\sum \left(T_{L8} - T_{Station}\right)^2}{n}} \quad (17)$$

$$STD\ of\ Error = \sqrt{\frac{\sum \left(T_{Error} - \overline{T}_{Error}\right)^2}{n}} \quad (18)$$

$$Bias = \frac{\sum \left(T_{L8} - T_{Station}\right)}{n} \quad (19)$$

where $T_{L8}$ and $T_{Station}$ are the Landsat 8-derived LST and station-based LST, respectively, and $n$ refers to the number of data. $T_{Error}$ refers to the difference between Landsat 8-derived LST and station-based LST, and $\overline{T}_{Error}$ is the mean value of these differences.
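The three metrics can be computed together from paired samples, as in the sketch below. The function name and the sample temperatures are hypothetical; the standard deviation divides by n, following the form of the STD expression above.

```python
import math


def validation_metrics(satellite_lst, station_lst):
    """Return RMSE, STD of error, and bias for paired LST samples (K)."""
    errors = [sat - sta for sat, sta in zip(satellite_lst, station_lst)]
    n = len(errors)
    bias = sum(errors) / n
    rmse = math.sqrt(sum(e ** 2 for e in errors) / n)
    std = math.sqrt(sum((e - bias) ** 2 for e in errors) / n)
    return {"rmse": rmse, "std": std, "bias": bias}


# Made-up matchups, for illustration only.
metrics = validation_metrics([301.0, 299.0, 303.0], [300.0, 300.0, 300.0])
print(metrics)
```

With the population form of the standard deviation, the three quantities satisfy RMSE² = Bias² + STD², which gives a quick consistency check on reported validation tables.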

Results
To present the results of the LST retrieval methods for daytime and nighttime, 21 pairs of nighttime and daytime Landsat-8 data were utilized to obtain the daytime and nighttime LST images (see Appendix A). Specifically, concerning the daytime LST, 21 Landsat-8 images were used. On the other hand, 21 nighttime images, whose acquisition times are close to daytime data (the difference ranges from one day to four days), were utilized for the nighttime LST retrieval by using the corresponding 21 daytime reflective data for the NDVI-based LSE computation. We verified that rain and/or snow did not occur during these 1-4 days of difference. MWA, RTE, and SCA were performed for both daytime and nighttime LST estimation considering all datasets. The required input atmospheric parameters in the methods ($\tau$, $L_{\lambda}^{\uparrow}$, $L_{\lambda}^{\downarrow}$) were obtained from ACPC, which is based on the MODTRAN radiative transfer code and uses NCEP-based atmospheric profiles as inputs. This section includes two sensitivity analyses: (i) sensitivity of in-situ LST measurements and (ii) sensitivity of LST retrieval methods to LSE. Lastly, the accuracy assessment of the LST retrieval algorithms and LSE models for both daytime and nighttime at the nine SURFRAD and ARM stations is presented.

Sensitivity Results of In-Situ LST Measurements
For the in-situ LST measurements used in this work, the average upwelling and downwelling radiances were 482.18 W/m² and 331.15 W/m², respectively, for daytime, and 388.16 W/m² and 326.68 W/m² for nighttime. In addition, as stated in the previous section, a fixed broadband emissivity of 0.97 was used. These values were therefore adopted in the sensitivity analysis of the in-situ LST measurements. To analyze the sensitivity of a method's output to one input parameter, the other input parameters are held fixed. For instance, for the sensitivity analysis of the downwelling radiance in the daytime (Figure 2a), the upwelling radiance and the broadband emissivity were fixed to 482.18 W/m² and 0.97, respectively. The sensitivity of the in-situ LST to the downwelling radiance was then obtained by changing the downwelling radiance at 5 W/m² intervals (Figure 2a). The same procedure was applied to the other parameters. As reported in Section 2.1, the two pyrgeometers have an accuracy of about 4.2 W/m² and a precision of around 1-2 W/m². The daytime sensitivity results were as follows: (i) a ±5 W/m² error in downwelling and upwelling radiance led to ±0.024 K and ±0.8 K error in LST, respectively (Figure 2a,b), and (ii) a 0.01 error in the broadband emissivity caused ±0.25 K error in LST (Figure 2c). The nighttime sensitivity results showed that (i) a ±5 W/m² error in downwelling and upwelling radiance led to ±0.029 K and ±0.95 K error in LST, respectively (Figure 2d,e), and (ii) a 0.01 error in the broadband emissivity caused ±0.12 K error in LST (Figure 2f). It is evident from Figure 2 that the effect on LST of the uncertainty in the downwelling and upwelling radiances is almost identical in daytime and nighttime. However, the effect of the broadband emissivity uncertainty in the nighttime is about half of that in the daytime.
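The one-at-a-time procedure can be reproduced with a short Python sketch. It assumes the Stefan-Boltzmann inversion commonly used to derive LST from pyrgeometer measurements, LST = [(L_up − (1 − ε) L_down) / (ε σ)]^(1/4); the expression actually used in the study is defined in an earlier section that is not reproduced here, so this form, the function names, and the sign of the differences are assumptions.

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4


def insitu_lst(l_up, l_down, eps=0.97):
    """In-situ LST (K) from broadband upwelling and downwelling
    longwave radiances (W/m^2) and broadband emissivity.

    Assumed form: Stefan-Boltzmann inversion with the reflected
    downwelling component removed from the upwelling radiance.
    """
    return ((l_up - (1.0 - eps) * l_down) / (eps * SIGMA)) ** 0.25


def sensitivity(l_up, l_down, eps=0.97, d_rad=5.0, d_eps=0.01):
    """Perturb one input at a time, keeping the others fixed, and
    return the resulting LST differences (perturbed minus baseline)."""
    base = insitu_lst(l_up, l_down, eps)
    return {
        "lst": base,
        "d_down": insitu_lst(l_up, l_down + d_rad, eps) - base,
        "d_up": insitu_lst(l_up + d_rad, l_down, eps) - base,
        "d_eps": insitu_lst(l_up, l_down, eps + d_eps) - base,
    }


# Mean radiances reported in the text (W/m^2), emissivity fixed at 0.97.
day = sensitivity(l_up=482.18, l_down=331.15)
night = sensitivity(l_up=388.16, l_down=326.68)

for label, result in (("day", day), ("night", night)):
    print(label, {key: round(value, 3) for key, value in result.items()})
```

Under this assumed formula, the magnitudes of the differences come out close to the values reported above for both daytime and nighttime. The emissivity term scales with the difference between downwelling and upwelling radiance, which is much smaller at night and accounts for the roughly halved emissivity sensitivity.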

Figure 2. Sensitivity results of in-situ Land Surface Temperature (LST) measurements to downwelling radiance, upwelling radiance, and broadband emissivity, respectively, for both daytime (a-c) and nighttime (d-f). LST error is computed as in Equation (16).

Sensitivity Results of LST Retrieval Methods to LSE
In this sensitivity analysis, we focused mainly on the effect of LSE on the LST retrieval methods for both daytime and nighttime, since we proposed using daytime LSE images for the nighttime LST calculation. A detailed uncertainty analysis of all parameters of the LST retrieval methods (RTE, SCA, and MWA) for daytime can be found in our previous paper [34]. In the sensitivity analysis of daytime LST images, the following input parameters were set based on the current datasets: air temperature, upwelling and downwelling radiances, atmospheric transmittance, and effective mean atmospheric temperature. The minimum, maximum, and mean near-surface air temperatures measured at the ground stations simultaneously with the satellite passes were 282.51 K, 302.41 K, and 295.95 K, respectively. Thus, the near-surface air temperature was assumed to be 295.95 K in the sensitivity analysis and, as a consequence, the effective mean atmospheric temperature was computed as 290.12 K. The atmospheric transmittance ranged from 0.63 to 0.94 with a mean value of 0.84, which was used in this analysis. The mean downwelling and upwelling radiances were 2.06 W·m⁻²·sr⁻¹·µm⁻¹ and 1.24 W·m⁻²·sr⁻¹·µm⁻¹, respectively, and these values were used in the sensitivity analyses. The brightness temperature range was set to 280 K to 310 K, because the brightness temperature computed from the daytime Landsat scenes ranged from 282.66 K to 314.84 K. The LSE value was fixed at 0.97. Figure 3 illustrates the sensitivity of the LST retrieval methods to LSE over this daytime brightness temperature range. Figure 3a,c,e shows the variations in the LST error under different brightness temperatures for MWA, RTE, and SCA, respectively, when the LSE error is constant. These figures show that, for a constant LSE error, the LST error of MWA and SCA increases with increasing brightness temperature, whereas the LST error of RTE remains stable. It is important to note that, since the LST error is computed as in Equation (16), an overestimation (underestimation) of the emissivity produces a positive (negative) value in the LST error. Figure 3b,d,f shows how the LSE error impacts the LST error for MWA, RTE, and SCA, respectively, under different brightness temperature conditions. These findings support the previous ones (Figure 3a,c,e): a constant LSE error produces LST error variations under different brightness temperature conditions for MWA and SCA, but not for RTE. The intercomparison of the results indicates that MWA is more sensitive to LSE error than RTE and SCA under increasing brightness temperatures, while RTE is the least sensitive.
Remote Sens. 2020, 12, x FOR PEER REVIEW 13 of 25
In the sensitivity analysis of the nighttime LST images, the minimum, maximum, and mean near-surface air temperatures from the ground stations were 271.65 K, 300.75 K, and 291.07 K, respectively. Considering the mean value (291.07 K), the effective mean atmospheric temperature was 285.60 K. The atmospheric transmittance varied between 0.51 and 0.96 with a mean value of 0.83, while the mean upwelling and downwelling radiances were 1.37 W·m⁻²·sr⁻¹·µm⁻¹ and 2.20 W·m⁻²·sr⁻¹·µm⁻¹, respectively. These mean values were used in the sensitivity analyses. A brightness temperature range from 270 K to 295 K was investigated, since the brightness temperature computed from the nighttime Landsat scenes varied from 267.77 K to 297.22 K. An LSE value of 0.97 was assumed. Figure 4 depicts the sensitivity of the LST retrieval methods to LSE over this nighttime brightness temperature range. Figure 4a,c,e shows the variations in the LST error under different brightness temperatures for MWA, RTE, and SCA, respectively, when the LSE error is constant. Moreover, Figure 4b,d,f shows how the LSE error impacts the LST error for MWA, RTE, and SCA, respectively, for varying brightness temperature values. The nighttime sensitivity analysis shows a trend similar to the daytime one; however, the variation in the LST error is smaller than in the daytime, which is consistent with the lower nighttime brightness temperatures.

Accuracy of LST Retrieval Algorithms and LSE Models for Daytime
In Figure 5, the accuracy results of the LST retrieval methods for daytime Landsat 8 data are illustrated for the six NDVI-based LSE models of Table 3. In this validation test at the nine SURFRAD and ARM stations, the Landsat 8 image pixel covering the pyrgeometers was selected, and the estimated LST was compared with the corresponding ground LST measurement.
The accuracy varied between 2.17 K RMSE and 5.47 K RMSE considering all LST methods and LSE models. The MWA method with LSE4 and LSE6 gave similar results, which were the best for the daytime. Using MWA and LSE4, the RMSE, STD of Error, and Bias were 2.17 K, 1.86 K, and −1.13 K, respectively. The same statistical metrics, in the same order, were 2.17 K, 1.79 K, and −1.24 K for MWA with LSE6. In general, the daytime results revealed that for all LSE models, except for LSE2, MWA showed slightly better results than RTE, and RTE slightly better results than SCA. LSE1 and LSE2 did not offer satisfactory results with any of the LST retrieval algorithms. The other LSE models gave acceptable daytime LST results with MWA, RTE, and SCA. The Bias is always negative regardless of the approach, highlighting a general overestimation of the Landsat 8 retrieval with respect to the in-situ measurements, especially for higher LST values.

Accuracy of LST Retrieval Algorithms and LSE Models for Nighttime
In Figure 6, the accuracy assessment of the LST retrieval methods for nighttime Landsat 8 data is reported for the six NDVI-based LSE models. Considering all LST methods and LSE models for the nighttime, the RMSE values ranged from 0.94 K to 3.34 K. In the nighttime LST analysis, the SCA method with LSE5 gave the best results, with RMSE, STD of Error, and Bias equal to 0.94 K, 0.72 K, and 0.60 K, respectively. MWA and RTE also provided very high accuracy, with RMSE equal to 1.01 K and 0.95 K, respectively, when used with LSE5. In general, the nighttime results revealed that for all LSE models, except for LSE2, all LST retrieval methods provided good accuracies, with a highest RMSE of 1.51 K. As a summary, Table 4 shows the best LST retrieval methods and LSE models for the proposed daytime and nighttime LST validation test at the nine SURFRAD and ARM stations. Compared to the daytime, all LST retrieval methods provided highly accurate nighttime results with the different LSE models. Moreover, the overestimation of daytime LST retrieval is no longer evident at night, and the bias is clearly reduced. The proposed test with ground measurements as reference suggests that the use of daytime NDVI-based LSE, whose acquisition is close to the nighttime data (the difference ranges from one day to four days in this work), is an accurate solution for nighttime LST retrieval from thermal band observations. We assumed that the LSE does not change significantly over a short time period if no rain or snow occurs; this weather condition was verified for the selected images.

Discussion
Numerous factors affect the accuracy of LST retrieval from satellite TIR data. Atmospheric profiles, sensor parameters (spectral range and viewing angle), and surface parameters (emissivity and geometry) are amongst the major ones. In addition, the development of an LST retrieval method has its own error sources, because it includes parameterization steps for the retrieval of coefficients and the estimation of some initial parameters. Therefore, it is of great importance to conduct sensitivity/uncertainty analyses for a new method by considering all input parameters. Concerning the LST validation procedure, error arises from two main sources: the ground-based LST and the satellite-based LST. The sensitivity analysis for ground-based LST measurements shows that the reliability of the upwelling radiance measurements is a key factor for the overall accuracy of the LST computation. The effect of LSE on satellite-based LST retrieval methods for both daytime and nighttime was then investigated, since we proposed using daytime LSE images for nighttime LST retrieval. The results showed that the LST sensitivity to LSE error typically depends on the brightness temperature, suggesting that areas and study periods with lower Tb can be expected to yield lower LST errors. The atmospheric parameters needed in the LST retrieval methods were obtained from NASA's ACPC, which is based on the MODTRAN radiative transfer code. In-situ atmospheric profiles (radiosonde data, etc.) are not available for every place and time. Thus, even though the use of atmospheric profile information simulated with ACPC affects the accuracy of the methods, our results and the literature indicate that NASA's ACPC provides satisfactory and effective simulations.
It is useful to compare the results obtained in this research with those of other similar studies. The daytime LST results of this study were compatible with the results presented in our previous paper [34]. Yu et al. [51] investigated the daytime LST results from the RTE and SCA methods using Landsat 8 data with LSE5. They determined the RMSE values for RTE and SCA as 0.9 K and 1.39 K, respectively, whereas we obtained 2.71 K RMSE and 2.85 K RMSE for RTE and SCA, respectively, with the same LSE model. Wang et al. [105] reported RMSE values of 2.24 K and 1.77 K for the generalized SCA and the Practical Single-Channel Algorithm (PSCA), respectively. We obtained 2.73 K RMSE with the SCA and the same LSE model (LSE3). Sekertekin [47] obtained 3.12 K RMSE using RTE and LSE4, while it was 2.62 K RMSE in this test. Guo et al. [54] used SCA with daytime Landsat 8 data and obtained 2.74 K and 2.47 K RMSE before and after the stray light correction, respectively. In our study, SCA results ranged from 2.73 K to 2.85 K RMSE under different NDVI threshold-based LSE models. We also observed negative biases for the selected dataset, whereas Guo et al. [54] did not observe biases in their case study. These validation studies of Landsat 8-derived LST refer to daytime data, and they suggest how the accuracies can differ in similar test sites if the number of scenes and their acquisition times change.
No validation studies had previously been published for nighttime LST from Landsat 8. This test shows that, compared to the daytime, the nighttime accuracy is better, the daytime LST overestimation is no longer present, and the bias is distinctly reduced. This is an interesting and useful result for researchers considering the use of nighttime LST data from Landsat 8. Further studies can be conducted over different land cover types, including urban areas, to confirm the effectiveness of the nighttime LST results. However, it may be difficult to find reliable ground-based LST measurements for accuracy assessments in these different areas.
Satellite-based LST retrieval methods are generally developed considering different conditions and assumptions. Thus, no universal method is yet available to provide accurate LSTs from all satellite TIR data, and it cannot be said that one method is systematically superior to the others. Concerning the stationarity of the methods used in this study, since RTE and SCA are obtained by the radiative transfer equation solution, they are valid for each sensor and atmospheric condition. On the other hand, the MWA is linked to atmospheric parameters and fixed coefficients regardless of the sensor type. However, these coefficients could be refined for different sensors (with different bandwidths), and the results validated.

Conclusions
In this study, three LST retrieval algorithms, namely, RTE, SCA, and MWA, were evaluated using daytime and nighttime Landsat 8 OLI/TIRS data. To the best of our knowledge, this is the first study proposing the retrieval and validation of nighttime LST from TIR data of Landsat 8, also with a performance comparison with respect to daytime LST retrieval. Since LSE is one of the most important factors affecting the accuracy of LST retrieval methods, the effects of six NDVI-based LSE models on satellite-based LST accuracy were also investigated.
Concerning nighttime LST retrievals, we proposed the combined use of daytime LSE and nighttime TIR data when the two acquisitions are close in time (a few days apart) and weather conditions are unchanged between them. To evaluate the LST retrieval methods and LSE models under daytime and nighttime conditions, SURFRAD and ARM SGP sites were used to calculate in-situ LST simultaneous with the TIR data acquisitions.
In addition to the accuracy evaluation of the LST methods, we conducted detailed sensitivity/uncertainty analyses of the in-situ measurements and of the sensitivity of the LST methods to LSE for both daytime and nighttime. The daytime sensitivity results of the in-situ measurements showed that a ±5 W/m² error in downwelling and upwelling radiance led to ±0.024 K and ±0.8 K error in LST, respectively, and a 0.01 error in the broadband emissivity caused ±0.25 K error in LST. In the nighttime, a ±5 W/m² error in downwelling and upwelling radiance caused ±0.029 K and ±0.95 K error in LST, respectively, and a 0.01 error in the broadband emissivity caused ±0.12 K error in LST. The sensitivity results of the in-situ LST measurements revealed that the effect of the uncertainty in the downwelling and upwelling radiances was almost identical in daytime and nighttime, whereas the effect of the broadband emissivity uncertainty in the nighttime was about half of that in the daytime.
Then, we investigated the sensitivity of the LST methods to LSE for both daytime and nighttime LST retrievals. The sensitivity results indicated that, for a constant LSE error, the LST error of MWA and SCA increased with increasing brightness temperature, whereas the LST error of RTE remained stable. The nighttime sensitivity analysis showed trends similar to the daytime ones; however, the variation in the LST error was smaller than in the daytime, mainly due to the lower brightness temperatures.
The accuracy results of the daytime Landsat 8 data at the nine ground stations showed that the MWA method with LSE4 and LSE6 gave the best results for the daytime. In general, for all the LSE models, except for LSE2, MWA gave slightly better results than RTE, and RTE slightly better results than SCA, for daytime LST retrievals. For the nighttime, the SCA method with LSE5 gave the best results, although MWA and RTE provided results very similar to those of SCA. Compared to the daytime, all LST retrieval methods provided highly accurate results with the different LSE models in the nighttime. The systematic overestimation of daytime LST retrieval is no longer present at night, and the bias is evidently reduced. The validation test shows that the use of daytime NDVI-based LSE, computed from reflective data acquired close to the nighttime thermal data, is a reliable solution for nighttime LST retrieval.