Surface Shortwave Net Radiation Estimation from Landsat TM / ETM + Data Using Four Machine Learning Algorithms

: Surface shortwave net radiation (SSNR) ﬂux is essential for the determination of the radiation energy balance between the atmosphere and the Earth’s surface. The satellite-derived intermediate SSNR data are strongly needed to bridge the gap between existing coarse-resolution SSNR products and point-based measurements. In this study, four di ﬀ erent machine learning (ML) algorithms were tested to estimate the SSNR from the Landsat Thematic Mapper (TM) / Enhanced Thematic Mapper Plus (ETM + ) top-of-atmosphere (TOA) reﬂectance and other ancillary information (i.e., clearness index, water vapor) at instantaneous and daily scales under all sky conditions. The four ML algorithms include the multivariate adaptive regression splines (MARS), backpropagation neural network (BPNN), support vector regression (SVR), and gradient boosting regression tree (GBRT). Collected in-situ measurements were used to train the global model (using all data) and the conditional models (in which all data were divided into subsets and the models were ﬁtted separately). The validation results indicated that the GBRT-based global model (GGM) performs the best at both the instantaneous and daily scales. For example, the GGM based on the TM data yielded a coe ﬃ cient of determination value ( R 2 ) of 0.88 and 0.94, an average root mean square error (RMSE) of 73.23 W · m -2 (15.09%) and 18.76 W · m -2 (11.2%), and a bias of 0.64 W · m -2 and –1.74 W · m -2 for instantaneous and daily SSNR, respectively. Compared to the Global LAnd Surface Satellite (GLASS) daily SSNR product, the daily TM-SSNR showed a very similar spatial distribution but with more details. Further analysis also demonstrated the robustness of the GGM for various land cover types, elevation, general atmospheric conditions, and seasons


Introduction
Surface energy fluxes profoundly affect our ability to understand how Earth's climate responds to increasing concentrations of greenhouse gases [1]. Surface shortwave net radiation (SSNR) is defined

In-Situ Measurements
Mathematically, SSNR can be expressed as: where R g and R gout are the downwelling and upwelling shortwave radiation fluxes at the surface, and α is the surface broadband shortwave albedo. The SSNR measurements were calculated from the observed R g and R gout and collected from 171 sites belonging to 10 global measurement networks during 1994-2011. Table 2 provides detailed information on each network. Figure 1 shows the geographical distributions of all sites and the climate zones they belong to. These climate zones are defined following the Köppen-Geiger climate classification [48]. These sites are located across the globe and therefore represent the different climatic and ecosystem conditions, ranging from the Arctic to the Antarctic. Figure 2 shows the elevations of all sites ranging from 1 to 3423 m above sea level, and Table 3 provides the statistics of the land cover types of these sites defined by the International Geosphere-Biosphere Programme (IGBP). In this paper, 11 land cover types were considered, including evergreen needleleaf forest (ENF), evergreen broadleaf forest (EBF), deciduous needleleaf forest (DNF), deciduous broadleaf forest (DBF), mixed forest (MF), shrublands (SHB), savannas (SAV), grasslands (GRA), permanent wetlands (PW), croplands (CRO), and barren lands (BR). Note that some land cover types (i.e., ice, snow and water body) were not considered due to the lack of available samples. The comprehensive representation of land cover types and climate Remote Sens. 2019, 11, 2847 4 of 29 types, widespread spatial distribution, and different elevations allows the global applicability of the model to be developed and evaluated.  [44,45]. 4 EOL: Earth Observing Laboratory. 5 GC_NET: Greenland Climate Network [46]. 6 La Thuile: Global Fluxnet (La Thuile dataset). 7 SAFARI.2000: Southern African Regional Science Initiative Project. 8 SURFRAD: Surface Radiation Network [47].
Remote Sens. 2019, 11, x FOR PEER REVIEW 4 of 30   [46]. 6 La Thuile: Global Fluxnet (La Thuile dataset). 7 SAFARI.2000: Southern African Regional Science Initiative Project. 8 SURFRAD: Surface Radiation Network [47]. The radiation measurements were all pre-processed according to the requirements of this study. After strict quality control [49], the timing of these observations was first converted into solar time for consistency, following which the radiation measurements ( and ) were aggregated into hourly (transit time being taken as the midpoint) and diurnal mean values. In this study, "instantaneous" is defined as the hourly scale, considering the various sampling time of each site (see Table 2). Lastly, the instantaneous and daily SSNR observations were calculated for each site and  The radiation measurements were all pre-processed according to the requirements of this study. After strict quality control [49], the timing of these observations was first converted into solar time for consistency, following which the radiation measurements (R g and R gout ) were aggregated into hourly (transit time being taken as the midpoint) and diurnal mean values. In this study, "instantaneous" is defined as the hourly scale, considering the various sampling time of each site (see Table 2). Lastly, the instantaneous and daily SSNR observations were calculated for each site and their values are shown in Table 3. To ensure the quality of all estimates, daily values were only calculated if at least one observation was available in each hour during a diurnal period, and any unreasonable values were manually removed at last.

Remotely Sensed Data
The TOA reflectance data from the Landsat 5/TM and 7/ETM+ were used as inputs for the new model. For comparison purposes, the GLASS (Global LAnd Surface Satellite) SSNR products (calculated from downwelling solar radiation and surface albedo) were also used.
• Landsat 5/TM and 7/ETM+ data Landsat project is a joint initiative between the U.S. Geological Survey (USGS) and NASA. With a time coverage dating back to the 1980s, the Landsat multispectral data represent the world's longest continuously acquired collection of space-based land data, and has a relatively high resolution (~1 km) compared with that of other remote sensing projects. In this study, the data from Landsat 5/TM were used for modeling and the Landsat 7/ETM+ data were used for the evaluation of the model applicability. The data from the TM and ETM+ sensors onboard the Landsat 5 and 7 satellite have been proven extremely useful for various field studies such as land surface radiation budget [30,50,51], surface temperature [52], or surface albedo [53] estimation. The characteristics of all bands in Landsat 5/TM and Landsat 7/ETM+ are shown in Table 4, demonstrating that the bands properties from the two sensors are nearly the same for bands 1-7. Therefore, the data from the two sensors are used together in this study from 1994 to 2011. After geometric and radiometric calibration, the TOA reflectance (band 1-5, and 7), BT (band 6), geometry information (SZA and SAA), and transit time of Landsat 5/TM and 7/ETM+ could be obtained directly. Then, the pixels corresponding to each site location were extracted in all available TM or ETM+ scenes. In the data pre-processing procedure, the LEDAPS (Landsat Ecosystem Disturbance Adaptive Processing System) atmosphere correction tool (version 3.3) [54,55] was applied and the generated cloud mask (using a method for cloud and cloud shadow detection, called Fmask (function of mask) [56])) was also used to classify the sky conditions, such as cloudy (qc values ∈ [2,34]), cloud shadow (qc values ∈ [4,12,20,36,52]), and clear (qc values ∈ others), in this study. As mentioned by Zhu et al. [56], the overall cloud accuracy of Fmask was as high as 96.4%, and even the worst accuracy for cloud shadow was still more than 70%. The Fmask has been used frequently in previous studies [57]. Note that the saturated pixels whose TOA values in some bands are abnormal due to highly reflective surfaces (i.e., snow or clouds) were excluded [58,59].

• GLASS products
The GLASS satellite products of daily downwelling solar radiation (DSR) [3,21] and daily shortwave broadband albedo [3,60] at 5 km spatial resolution were used in this study to obtain the daily SSNR, according to Equation (1), for comparison purpose. GLASS DSR product was retrieved from multiple polar-orbiting and geostationary satellite data by LUT. GLASS's albedo product is produced from the advanced very high-resolution radiometer (AVHRR) and MODIS data based on different estimation algorithms. The two GLASS products have previously been evaluated using ground measurements and compared with other products [61,62], and the results of this comparison demonstrated that their accuracy is satisfactory, and even exceeded that of most of the other products. The comparison period was from 2000 and 2011 due to the limitation of the available GLASS data. The GLASS DSR and albedo products are available at: http://glass-product.bnu.edu.cn/.

MERRA-2 Reanalysis Data
The Modern-Era Retrospective Analysis for Research and Applications, version 2 (MERRA-2), is a global atmospheric reanalysis dataset produced by the NASA Global Modeling and Assimilation Office (GMAO) [63] at 0.5 x 0.625 • spatial resolution. MERRA-2 is the updated version of MERRA [64], with improved estimates for various parameters, especially those related to the hydrological cycle. The hourly total precipitable water vapor (WV, g/cm 2 ) from MERRA-2 M2T1NXSLV [65] was extracted for each site and aggregated into average daily values.

Other Parameters
In this study, CI, which represents the daily atmospheric transmittance [66], was used as an input. It is defined as the ratio of the daily solar radiation and the extraterrestrial radiation as follows: where R g (W·m -2 ) is from the GLASS DSR product, and R se (W·m -2 ) is the extraterrestrial radiation calculated using Equation (3) from [67]: where G sc is the solar constant (0.0820 MJm -2 ·min -1 ),d r the inverse relative distance from the Earth to the Sun, ω s the sunset hour angle (rad), ϕ the latitude (rad), δ the solar declination (rad), z the elevation (m), and DOY is the day of the year.

Methods
The flowchart delineating the various processing for the implementation of the method described in this study is shown in Figure 3. After collection of the input data, the data were first randomly separating into two parts: 80% (13,116 samples for instantaneous and 9722 for daily) for training and the other 20% (3268 for instantaneous and 2432 for daily) for independent validation. Following this, four machine learning algorithms, including the multivariate adaptive regression splines (MARS) [68], the backpropagation neural network (BPNN) [69], the support vector regression (SVR) [70], and the gradient boosting regression tree (GBRT) [71], were used for the instantaneous SSNR (Ins-SSNR) and the daily SSNR (D-SSNR) estimation in the global mode (use of all data to fit the model) and conditional mode (classification of the data into clear, cloudy, and cloud shadows for instantaneous scale, and by three intervals of CI for daily scale, with the models being fitted separately), respectively. Thirdly, the optimal algorithm was determined for the Ins-and D-SSNR estimation after a comprehensive assessment. To be more objective, the performances of the new developed models were compared with the results from the WangLUT method, and more analytical experiments were also implemented to illustrate their robustness. Finally, the generated D-SSNR from Landsat 5/TM data was compared with the GLASS product. In this study, all the algorithms were implemented under the Microsoft Windows 10 system on an Intel Core 3.40 GHz PC with 8 GB memory. Details of the four algorithms are provided in Appendix A.
A 2-fold cross validation was conducted for the validation of the training accuracy, and four statistic index were used: The determination coefficient (R 2 ), root mean square error (RMSE), mean absolute error (MAE), and bias: where e i is the estimated value, and o i is the ground observation for the i th pair.  A 2-fold cross validation was conducted for the validation of the training accuracy, and four statistic index were used: The determination coefficient (R 2 ), root mean square error (RMSE), mean absolute error (MAE), and bias: where is the estimated value, and is the ground observation for the i th pair.

Model Development
After pre-processing, 16,384 instantaneous samples (13,116 for training and 3268 for validation) and 12,154 daily samples (9722 for training and 2432 for validation) were collected from 171 sites

Model Development
After pre-processing, 16,384 instantaneous samples (13,116 for training and 3268 for validation) and 12,154 daily samples (9722 for training and 2432 for validation) were collected from 171 sites between 1994 and 2011 for Ins-SSNR and D-SSNR estimation models development and validation, respectively. The models (Ins-SSNR or D-SSNR) were fitted with the observations in two cases. In case 1, the models were fitted using all available observations, i.e., global model. In case 2, the observations were divided into several subsets based on the cloud mask (for Ins-SSNR) or CI (for D-SSNR), and the models thus obtained are referred to as conditional models.

Ins-SSNR
The various effects of the atmosphere on the solar radiations (i.e., cloud, water vapor) need to be accounted for [32] to ensure accuracy of the estimates. For instance, in the WangLUT method, the SSNR estimates need to be corrected for the influence of water vapor. To highlight the difference with the WangLUT method, the BT and the WV were taken as input dimensions to estimate the Ins-SSNR (see Table 1). As Equation (1) shows, the SSNR was determined by R g and surface shortwave broadband albedo, which was closely related to the land surface types and their change. Some previous studies demonstrated that Landsat BT data have been widely used for land cover classification and assessment of land use change [72,73], hence it was reasonable to be used as input to estimate the SSNR. The results with or without BT and/or WV are discussed later.
To ensure objectivity of the estimates, the model training accuracy was evaluated using a 2-fold cross-validation method with all Ins-SSNR training samples. Table 5 shows the training accuracy results of four Ins-SSNR estimation models in global mode. The results illustrate that the Ins-SSNR GBRT global model (GGM, hereinafter) performed the best, with a coefficient of determination R 2 of 0.86, an RMSE of 75.72 W·m -2 , an MAE of 51.51 W·m -2 and a bias of -0.38 W·m -2 , when the BT and WV were both added as inputs. Thus, the GBRT algorithm had acceptable performances and a very short fitting time (0.03 sec), although the training time (2 hours) was a little longer.
As a global mode, Ins-SSNR conditional models under clear-, cloudy-, and cloud shadow-sky with different inputs were also developed. After several rounds of experiments, the GBRT algorithm was found to have superior performances to the other three algorithms in fitting efficiency and estimation accuracy. Hence, it was applied to develop the Ins-SSNR conditional models, and the results of the conditional model under these three different sky conditions are listed in Table 6. These results indicate that the Ins-SSNR GBRT conditional models (GCMs, hereinafter) performed the best under any condition when both the BT and WV were added as inputs. Relative to the cloud conditions, the accuracy of the model under cloud shadow is the worst, which is possibly due to the lower number of samples under this condition (887). It was also found that, with a higher RMSE (77.92 W·m -2 ) and MAE (52.88 W·m -2 ) the overall accuracy of the GCM, when combining the three conditional models results together (the last row in Table 6), is less good than that of the GGM (Table 5). Thus, the accuracy of the results from the GGM (with BT and WV as inputs) under the same three conditions was also examined, and is also displayed in Table 6. These results demonstrate that for most metrics, the performances in terms of accuracy of the GGM under the three conditions are better than those from the GCM. This is especially true in the case of cloud shadow condition with values of R 2 of 0.78 against 0.71, RMSEs of 90.89 against 102.35 W·m -2 , and MAE of 67.89 against 77.06 W·m -2 , respectively. Although the bias presents a less good result for the GGM with a value of 13.91 W·m -2 , the predicted results are closer to the 1:1 line and more concentrated. Therefore, it can be concluded that the GGM with BT and WV as inputs is the optimal model for Ins-SSNR estimation. Besides, the contribution of each input in GGM for the Ins-SSNR estimation are displayed in Figure 4. The Ins-SSNR estimations are shown to be more sensitive to BT and WV than to most of the other bands, thus highlighting the importance of taking BT and WV as inputs.
Remote Sens. 2019, 11, x FOR PEER REVIEW 11 of 30 Besides, the contribution of each input in GGM for the Ins-SSNR estimation are displayed in Figure 4. The Ins-SSNR estimations are shown to be more sensitive to BT and WV than to most of the other bands, thus highlighting the importance of taking BT and WV as inputs.

D-SSNR
In the WangLUT method, the D-SSNR was calculated from the integral of the Ins-SSNR based on the assumption that the atmospheric conditions were moderately invariant during the daytime. However, this assumption was found to be unreasonable after statistical analysis (see Figure 5). The cloud mask classification results were used as the atmospheric condition at the transit time (y axis), and the CI represents the daily average atmospheric condition (x axis). The results in this figure illustrate that no significant correlation exists between the two atmospheric conditions (instantaneous and daily) at the same date. To address this issue in this study, the CI was considered as one input to represent the daily average atmospheric condition. As in the Ins-SSNR, the D-SSNR estimation models were also developed using four algorithms with different inputs for both the global and conditional modes. However, in the D-SSNR case, the daily conditions are classified by CI, with three categories obtained: CI < 0.2 for overcast sky, 0.2 ≤ CI < 0.7 for cloudy sky, and CI ≥ 0.7 for clear sky. Following the results of the Ins-SSNR model testing, both BT and WV are taken as inputs in the D-SSNR modeling. The 2-fold cross-validation results of D-SSNR estimation models using four algorithms in global mode with or without CI are summarized in Table 7. These results demonstrate that all models performed better when CI was added as an input. In this case too, the GGM exhibits the best

D-SSNR
In the WangLUT method, the D-SSNR was calculated from the integral of the Ins-SSNR based on the assumption that the atmospheric conditions were moderately invariant during the daytime. However, this assumption was found to be unreasonable after statistical analysis (see Figure 5). The cloud mask classification results were used as the atmospheric condition at the transit time (y axis), and the CI represents the daily average atmospheric condition (x axis). The results in this figure illustrate that no significant correlation exists between the two atmospheric conditions (instantaneous and daily) at the same date. To address this issue in this study, the CI was considered as one input to represent the daily average atmospheric condition. As in the Ins-SSNR, the D-SSNR estimation models were also developed using four algorithms with different inputs for both the global and conditional modes. However, in the D-SSNR case, the daily conditions are classified by CI, with three categories obtained: CI < 0.2 for overcast sky, 0.2 ≤ CI < 0.7 for cloudy sky, and CI ≥ 0.7 for clear sky. Following the results of the Ins-SSNR model testing, both BT and WV are taken as inputs in the D-SSNR modeling.
Remote Sens. 2019, 11, x FOR PEER REVIEW 11 of 30 Besides, the contribution of each input in GGM for the Ins-SSNR estimation are displayed in Figure 4. The Ins-SSNR estimations are shown to be more sensitive to BT and WV than to most of the other bands, thus highlighting the importance of taking BT and WV as inputs.

D-SSNR
In the WangLUT method, the D-SSNR was calculated from the integral of the Ins-SSNR based on the assumption that the atmospheric conditions were moderately invariant during the daytime. However, this assumption was found to be unreasonable after statistical analysis (see Figure 5). The cloud mask classification results were used as the atmospheric condition at the transit time (y axis), and the CI represents the daily average atmospheric condition (x axis). The results in this figure illustrate that no significant correlation exists between the two atmospheric conditions (instantaneous and daily) at the same date. To address this issue in this study, the CI was considered as one input to represent the daily average atmospheric condition. As in the Ins-SSNR, the D-SSNR estimation models were also developed using four algorithms with different inputs for both the global and conditional modes. However, in the D-SSNR case, the daily conditions are classified by CI, with three categories obtained: CI < 0.2 for overcast sky, 0.2 ≤ CI < 0.7 for cloudy sky, and CI ≥ 0.7 for clear sky.
Following the results of the Ins-SSNR model testing, both BT and WV are taken as inputs in the D-SSNR modeling. The 2-fold cross-validation results of D-SSNR estimation models using four algorithms in global mode with or without CI are summarized in Table 7. These results demonstrate that all models performed better when CI was added as an input. In this case too, the GGM exhibits the best The 2-fold cross-validation results of D-SSNR estimation models using four algorithms in global mode with or without CI are summarized in Table 7. These results demonstrate that all models performed better when CI was added as an input. In this case too, the GGM exhibits the best performance with a R 2 of 0.93, an RMSE of 21.01 W·m -2 , an MAE of 14.07 W·m -2 , and a bias of 0.12 W·m -2 . The GBRT algorithm also works the best in conditional mode, thereby, only the results using GRBT algorithm are delineated here. Overall, the accuracy was better under all conditions when CI was added, especially for CI [0.2, 0.7), with a significant increased accuracy (R 2 of 0.91 and 0.87, RMSEs of 21.06 and 27.08 W·m -2 , MAEs of 14.55 and 19.42 W·m -2 , and biases of 0.24 and -2.53 W·m -2 , with and without CI as an input, respectively). However, the accuracy was the worst for CI < 0.2, which illustrates how thick clouds block much of the surface information. The overall accuracy from GGM (with CI) ( Table 8) is similar to the combined results from the three GCM (with CI) (the last row in Table 9) with R 2 of 0.93 and 0.92, RMSEs of 21.01 and 21.34 W·m -2 , MAEs of 14.07 and 14.52 W·m -2 , and biases of 0.12 and 0.04 W·m -2 , respectively. The accuracy of the results from the D-SSNR GGM with CI under three conditions was also examined and is shown in Table 8. Similar conclusions as for the Ins-SSNR estimation are reached, i.e., that the GGM for D-SSNR estimations performs better than the GCM, particularly under the CI < 0.   The results of the importance of each input in the GGM ( Figure 6) also indicate that the addition of CI has a remarkable impact on D-SSNR estimation, thus highlighting the necessity of adding CI as an input. The use of CI was therefore chosen for the D-SSNR estimation model of the GGM.

Direct Validation
The performance of the GGM for Ins-SSNR and D-SSNR estimates was evaluated with different inputs (BT and WV for Ins-SSNR, CI for D-SSNR), and the results were validated against independent validation samples (3268 for Ins-SSNR, 2432 for D-SSNR) as shown in Figure 7.  The Ins-SSNR and D-SSNR results from WangLUT method were also validated for comparison, these results being shown in Figures 8 and 9. Note, however, that as the WangLUT method is only applicable under clear-and cloud-sky conditions, when WV is larger than 0.5 g·cm -2 , only the in-situ samples corresponding to these conditions (2901 for Ins-SSNR, 2010 for D-SSNR) were used for the validation.

Direct Validation
The performance of the GGM for Ins-SSNR and D-SSNR estimates was evaluated with different inputs (BT and WV for Ins-SSNR, CI for D-SSNR), and the results were validated against independent validation samples (3268 for Ins-SSNR, 2432 for D-SSNR) as shown in Figure 7.

Direct Validation
The performance of the GGM for Ins-SSNR and D-SSNR estimates was evaluated with different inputs (BT and WV for Ins-SSNR, CI for D-SSNR), and the results were validated against independent validation samples (3268 for Ins-SSNR, 2432 for D-SSNR) as shown in Figure 7. The Ins-SSNR and D-SSNR results from WangLUT method were also validated for comparison, these results being shown in Figures 8 and 9. Note, however, that as the WangLUT method is only applicable under clear-and cloud-sky conditions, when WV is larger than 0.5 g·cm -2 , only the in-situ samples corresponding to these conditions (2901 for Ins-SSNR, 2010 for D-SSNR) were used for the validation. The Ins-SSNR and D-SSNR results from WangLUT method were also validated for comparison, these results being shown in Figures 8 and 9. Note, however, that as the WangLUT method is only applicable under clear-and cloud-sky conditions, when WV is larger than 0.5 g·cm -2 , only the in-situ samples corresponding to these conditions (2901 for Ins-SSNR, 2010 for D-SSNR) were used for the validation.   The validation accuracy of D-SSNR from the GGM is also better than the one from the WangLUT method (Figure 9), with an RMSE value of 18.08 W·m -2 and bias of -0.77 W·m -2 , which is even better than the results of previous studies [32]. The D-SSNR estimates are also shown to be closer to the 1:1 line, though the CI is very small (blue square) compared to the one from the WangLUT method.
The various comparison results clearly show that the predictive abilities of the GGM are better than those of WangLUT at both the instantaneous and daily time scales, while also verifying the limitations of the simulations.

Comparison with GLASS Product
A distribution map of D-SSNR was generated with the GGM (with CI as an input) from one Landsat 5/TM scene (path 123, row 32, acquired on 01/05/2003) (TM D-SSNR, Figure 10b).  The validation accuracy of D-SSNR from the GGM is also better than the one from the WangLUT method (Figure 9), with an RMSE value of 18.08 W·m -2 and bias of -0.77 W·m -2 , which is even better than the results of previous studies [32]. The D-SSNR estimates are also shown to be closer to the 1:1 line, though the CI is very small (blue square) compared to the one from the WangLUT method.
The various comparison results clearly show that the predictive abilities of the GGM are better than those of WangLUT at both the instantaneous and daily time scales, while also verifying the limitations of the simulations.

Comparison with GLASS Product
A distribution map of D-SSNR was generated with the GGM (with CI as an input) from one Landsat 5/TM scene (path 123, row 32, acquired on 01/05/2003) (TM D-SSNR, Figure 10b).  The validation accuracy of D-SSNR from the GGM is also better than the one from the WangLUT method (Figure 9), with an RMSE value of 18.08 W·m -2 and bias of -0.77 W·m -2 , which is even better than the results of previous studies [32]. The D-SSNR estimates are also shown to be closer to the 1:1 line, though the CI is very small (blue square) compared to the one from the WangLUT method.
The various comparison results clearly show that the predictive abilities of the GGM are better than those of WangLUT at both the instantaneous and daily time scales, while also verifying the limitations of the simulations.

Comparison with GLASS Product
A distribution map of D-SSNR was generated with the GGM (with CI as an input) from one Landsat 5/TM scene (path 123, row 32, acquired on 01/05/2003) (TM D-SSNR, Figure 10b). Simultaneously, the GLASS SSNR was calculated in the same area (GLASS D-SSNR, Figure 10c). The original TM image in false color is also shown in Figure 10a, including different land cover types, such as urban land (light blue color), bare land (gray color), vegetation (red color), and water body (black or navy blue color). Simultaneously, the GLASS SSNR was calculated in the same area (GLASS D-SSNR, Figure 10c). The original TM image in false color is also shown in Figure 10a, including different land cover types, such as urban land (light blue color), bare land (gray color), vegetation (red color), and water body (black or navy blue color). Overall, the spatial patterns of GLASS D-SSNR or TM D-SSNR are reasonable and agree well with the color composite image, e.g., the TM D-SSNR presents larger values for dense vegetation (red color in Figure 10a) because of a lower albedo, whereas for urban and bare land (blue and gray color in Figure 10a), the TM D-SSNR values are lower because of the strong reflection. Comparatively, the TM D-SSNR provides more details for characterizing the texture of different surface features and contours variations in topography (Figure 10b). Further, the histograms of D-SSNR values from the GLASS and TM are shown in Figure 11. While the shapes are similar, the range of D-SSNR values is broader for TM than for GLASS, with a more detailed distribution, which illustrates the discrepancies caused by the coarser spatial resolution of GLASS (0.05° against 30 m for TM). Overall, the spatial patterns of GLASS D-SSNR or TM D-SSNR are reasonable and agree well with the color composite image, e.g., the TM D-SSNR presents larger values for dense vegetation (red color in Figure 10a) because of a lower albedo, whereas for urban and bare land (blue and gray color in Figure 10a), the TM D-SSNR values are lower because of the strong reflection. Comparatively, the TM D-SSNR provides more details for characterizing the texture of different surface features and contours variations in topography (Figure 10b).
Further, the histograms of D-SSNR values from the GLASS and TM are shown in Figure 11. While the shapes are similar, the range of D-SSNR values is broader for TM than for GLASS, with a more detailed distribution, which illustrates the discrepancies caused by the coarser spatial resolution of GLASS (0.05 • against 30 m for TM). Further, the histograms of D-SSNR values from the GLASS and TM are shown in Figure 11. While the shapes are similar, the range of D-SSNR values is broader for TM than for GLASS, with a more detailed distribution, which illustrates the discrepancies caused by the coarser spatial resolution of GLASS (0.05° against 30 m for TM).
(a) (b) Figure 11. Histograms of D-SSNR values from GLASS (a) and TM (b) for the data displayed in Figure 10b,c.

Model Performance Analysis
To provide a more comprehensive evaluation, the performances of the GGM for various land cover types and elevation ranges are also discussed in the following chapter, with the results of WangLUT taken as reference.
• Land cover type As mentioned in Section 2, 11 land cover types were considered: ENF, EBF, DNF, DBF, GRA, MF, SHB, SAV, PW, CRO, and BR. The validation results for the various land cover types are shown in terms of normalized RMSE (NRMSE) to account for the different number of validation samples for each type: where X represents all of the observation values belonging to the same type, and the root mean square error RMSE is also calculated for the given type. The range of NRMSE is typically between 0 and 1, although in a few cases, such as when the values of X are small, it may exceed 1. A value of NRMSE close to 0 means that the accuracy of the estimation is high. As shown in Figure 12, the NRMSE values from GGM are found to be generally smaller than the ones from WangLUT method for most land cover types, except in D-SSNR for EBF, and in Ins-SSNR for DNF. The estimation accuracy was found to be better for D-SSNR than for Ins-SSNR for both methods, and overall it seems that both models perform well for land cover types with low heterogeneity, such as CRO and SAV. The less good performance of the GGM for the EBF and DNF land cover types might be explained by the lesser number of samples collected for modeling. Overall, the results demonstrate the good applicability of the GGM for various land surfaces at both the instantaneous and daily scales.

• Elevation
For high-resolution images, such as those from Landsat, elevation is an important factor affecting SSNR [50]. Here, validation samples are divided into eight elevation groups, shown in Table 9, and NRMSE results for the various groups are shown in Figure 13 for both the GGM and the WangLUT method.
for DNF. The estimation accuracy was found to be better for D-SSNR than for Ins-SSNR for both methods, and overall it seems that both models perform well for land cover types with low heterogeneity, such as CRO and SAV. The less good performance of the GGM for the EBF and DNF land cover types might be explained by the lesser number of samples collected for modeling. Overall, the results demonstrate the good applicability of the GGM for various land surfaces at both the instantaneous and daily scales.

Elevation
For high-resolution images, such as those from Landsat, elevation is an important factor affecting SSNR [50]. Here, validation samples are divided into eight elevation groups, shown in Table  9, and NRMSE results for the various groups are shown in Figure 13 for both the GGM and the WangLUT method.  As shown in Figure 13, the NRMSE values from GGM are generally smaller than those from WangLUT for both D-SSNR and Ins-SSNR. The GGM values are relatively invariant and rarely exceed 0.2, except for the >3000 m interval. This may be explained by the large values of SZAs for the SSNR observations above 3000 m, which are greater than 60 degrees, and make it difficult to estimate SSNR [32]. Besides, the performances of GGM for D-SSNR estimates are better than Ins-SSNR for all elevation ranges. Note that it was inevitable that many stations are located in mountainous areas, and the performances of these methods in these regions need to be further evaluated.

• Sky condition
The applicability of GGM under various daily atmospheric conditions was also analyzed. As mentioned above, the daily average atmospheric conditions are represented by CI, and the number of validation samples in each CI interval (with a 0.1 increment) was counted and shown in Figure 14. Most observations were collected when CI was between 0.5-0.75, which means that cloudy-sky conditions were the most common. The RMSE and the NRMSE were calculated for each CI interval and are shown in Figure 15.  Figure 13. The same as Figure 12, but for different elevation ranges.
As shown in Figure 13, the NRMSE values from GGM are generally smaller than those from WangLUT for both D-SSNR and Ins-SSNR. The GGM values are relatively invariant and rarely exceed 0.2, except for the >3000 m interval. This may be explained by the large values of SZAs for the SSNR observations above 3000 m, which are greater than 60 degrees, and make it difficult to estimate SSNR [32]. Besides, the performances of GGM for D-SSNR estimates are better than Ins-SSNR for all elevation ranges. Note that it was inevitable that many stations are located in mountainous areas, and the performances of these methods in these regions need to be further evaluated.

Sky condition
The applicability of GGM under various daily atmospheric conditions was also analyzed. As mentioned above, the daily average atmospheric conditions are represented by CI, and the number of validation samples in each CI interval (with a 0.1 increment) was counted and shown in Figure 14.
Most observations were collected when CI was between 0.5-0.75, which means that cloudy-sky conditions were the most common. The RMSE and the NRMSE were calculated for each CI interval and are shown in Figure 15.   Figure 13. The same as Figure 12, but for different elevation ranges.
As shown in Figure 13, the NRMSE values from GGM are generally smaller than those from WangLUT for both D-SSNR and Ins-SSNR. The GGM values are relatively invariant and rarely exceed 0.2, except for the >3000 m interval. This may be explained by the large values of SZAs for the SSNR observations above 3000 m, which are greater than 60 degrees, and make it difficult to estimate SSNR [32]. Besides, the performances of GGM for D-SSNR estimates are better than Ins-SSNR for all elevation ranges. Note that it was inevitable that many stations are located in mountainous areas, and the performances of these methods in these regions need to be further evaluated.

Sky condition
The applicability of GGM under various daily atmospheric conditions was also analyzed. As mentioned above, the daily average atmospheric conditions are represented by CI, and the number of validation samples in each CI interval (with a 0.1 increment) was counted and shown in Figure 14.
Most observations were collected when CI was between 0.5-0.75, which means that cloudy-sky conditions were the most common. The RMSE and the NRMSE were calculated for each CI interval and are shown in Figure 15.  The NRMSE is very large (up to nearly 300%) for CI values between 0 and 0.2, indicating that D-SSNR cannot be estimated properly under overcast sky. When CI increases, the NRMSE value is gradually reduced and reaches a relatively stable value (less than 0.2). Hence, it can be concluded that the GGM works better under medium to high atmospheric transmittance conditions (CI > 0.4).

Seasonal analysis
For better understanding the performance of GGM in different season, the variations in the validation accuracies for Ins-and D-SSNR were explored for each month shown in Figure 16. And Figure 17 gives the distribution of the validation datasets in each month at instantaneous and daily scales. It can be found that the samples were mostly collected from May to September.  The NRMSE is very large (up to nearly 300%) for CI values between 0 and 0.2, indicating that D-SSNR cannot be estimated properly under overcast sky. When CI increases, the NRMSE value is gradually reduced and reaches a relatively stable value (less than 0.2). Hence, it can be concluded that the GGM works better under medium to high atmospheric transmittance conditions (CI > 0.4).
• Seasonal analysis For better understanding the performance of GGM in different season, the variations in the validation accuracies for Ins-and D-SSNR were explored for each month shown in Figure 16. And Figure 17 gives the distribution of the validation datasets in each month at instantaneous and daily scales. It can be found that the samples were mostly collected from May to September.
As shown in Figure 17, the obvious seasonal characteristics were revealed for both Ins-and D-SSNR estimations. The NRMSE values of Ins-SSNR and D-SSNR were generally both lower from April to August than other months, and the RMSE or NRMSE variations were more stable for D-SSNR than Ins-SSNR, which was possibly caused by the intense interactions between land surface and atmosphere during this period. Overall, the variation trends is similar to the existing studies that the RMSE value is higher in summer (represents June, July, and August) than in winter (represents December, January, and February) [74], and the performances of GGM were acceptable for different seasons. The NRMSE is very large (up to nearly 300%) for CI values between 0 and 0.2, indicating that D-SSNR cannot be estimated properly under overcast sky. When CI increases, the NRMSE value is gradually reduced and reaches a relatively stable value (less than 0.2). Hence, it can be concluded that the GGM works better under medium to high atmospheric transmittance conditions (CI > 0.4).

Seasonal analysis
For better understanding the performance of GGM in different season, the variations in the validation accuracies for Ins-and D-SSNR were explored for each month shown in Figure 16. And Figure 17 gives the distribution of the validation datasets in each month at instantaneous and daily scales. It can be found that the samples were mostly collected from May to September.  As shown in Figure 17, the obvious seasonal characteristics were revealed for both Ins-and D-SSNR estimations. The NRMSE values of Ins-SSNR and D-SSNR were generally both lower from April to August than other months, and the RMSE or NRMSE variations were more stable for D-SSNR than Ins-SSNR, which was possibly caused by the intense interactions between land surface and atmosphere during this period. Overall, the variation trends is similar to the existing studies that the RMSE value is higher in summer (represents June, July, and August) than in winter (represents December, January, and February) [74], and the performances of GGM were acceptable for different seasons.

Extended Use to Landsat 7/ETM+ Data
Landsat 7/ETM+ data is the subsequent version of Landsat 5/TM and the bands from the two sensors are nearly the same in the band range 1-7 (see Table 4). Hence, the GGM was directly applied to Landsat 7/ETM+ data (bands 1-7) for estimation of Ins-SSNR and D-SSNR, and the results were validated against field measurements. The validation measurements were collected from 12 sites via ARM, and the information is summarized in Table 10.

Conclusions
SSNR products at intermediate-to-high spatial resolution are required for various scientific studies and applications. Landsat 30-m satellite data provides a practical way for mapping SSNR at a relatively high-spatial resolution. Based on previous studies, this paper presents new robust empirical models to estimate the SSNR directly from Landsat TOA data and other ancillary information at both the instantaneous and daily scales. Observations obtained from 171 sites belonging to 10 global measuring networks during 1994-2011 were used for modeling and validation. These sites are distributed around the world and represent the most comprehensive information for such analysis (i.e., land cover types, climatic information, elevation, and so on). Four machine learning algorithms (MARS, BPNN, SVR, and GBRT) with different combinations of inputs using the complete set of measurements (global model) or several subsets based on the cloud mask or CI (conditional model) were developed for Ins-SSNR and D-SSNR, separately, following which the 2fold cross-validation method was used for model evaluation, before a final determination of the optimal model properties. The results from a modified hybrid method named WangLUT and from the GLASS product were compared, and the performances of the new model were analyzed. At the end, the extended applicability of the model for Landsat 7/ETM+ data was also evaluated.
Based on the results, the GBRT global model (GGM) with BI and WV as inputs for Ins-SSNR estimates, and with CI as input for D-SSNR estimates yielded the best results. The direct validation of the estimated Ins-SSNR and D-SSNR using Landsat 5/TM data from the GGM yielded a coefficient of determination (R 2 ) value of 0.88 and 0.94, an average RMSE of 73.23 W·m -2 (15.09%) and 18.76 W·m -2 (11.2%), an MAE of 49.19 and 13.06 W·m -2 , and a bias of 0.64 W·m -2 and -1.74 W·m -2 , respectively, which were generally better than the results from the WangLUT method. The analysis of the importance of each input in the GGM proved that the Ins-SSNR or D-SSNR estimate were sensitive to the TM band 1, BT, water vapor, and the CI. In particular, the addition of CI in D-SSNR estimation made the model yield more reasonable results, and improved the estimation accuracy significantly. Inter-comparison with the GLASS revealed that the TM D-SSNR display a very similar spatial distribution but with more details. Besides, the GGM is also appropriate for Landsat 7/ETM+ data without any alteration. Further analysis on the effects of land cover type, elevation, sky condition and different seasons demonstrated that the GGM has a wide applicability.
However, some limitations of the GGM exist. First, the GGM is an empirical model whose accuracy is largely determined by the variety and quality of the collected measurements. Second, the key parameters and architecture of the GBRT algorithm were set empirically. Third, the influence of topography was not taken into account in this study. Besides, the ancillary information used (i.e., CI

Conclusions
SSNR products at intermediate-to-high spatial resolution are required for various scientific studies and applications. Landsat 30-m satellite data provides a practical way for mapping SSNR at a relatively high-spatial resolution. Based on previous studies, this paper presents new robust empirical models to estimate the SSNR directly from Landsat TOA data and other ancillary information at both the instantaneous and daily scales. Observations obtained from 171 sites belonging to 10 global measuring networks during 1994-2011 were used for modeling and validation. These sites are distributed around the world and represent the most comprehensive information for such analysis (i.e., land cover types, climatic information, elevation, and so on). Four machine learning algorithms (MARS, BPNN, SVR, and GBRT) with different combinations of inputs using the complete set of measurements (global model) or several subsets based on the cloud mask or CI (conditional model) were developed for Ins-SSNR and D-SSNR, separately, following which the 2-fold cross-validation method was used for model evaluation, before a final determination of the optimal model properties. The results from a modified hybrid method named WangLUT and from the GLASS product were compared, and the performances of the new model were analyzed. At the end, the extended applicability of the model for Landsat 7/ETM+ data was also evaluated.
Based on the results, the GBRT global model (GGM) with BI and WV as inputs for Ins-SSNR estimates, and with CI as input for D-SSNR estimates yielded the best results. The direct validation of the estimated Ins-SSNR and D-SSNR using Landsat 5/TM data from the GGM yielded a coefficient of determination (R 2 ) value of 0.88 and 0.94, an average RMSE of 73.23 W·m -2 (15.09%) and 18.76 W·m -2 (11.2%), an MAE of 49.19 and 13.06 W·m -2 , and a bias of 0.64 W·m -2 and -1.74 W·m -2 , respectively, which were generally better than the results from the WangLUT method. The analysis of the importance of each input in the GGM proved that the Ins-SSNR or D-SSNR estimate were sensitive to the TM band 1, BT, water vapor, and the CI. In particular, the addition of CI in D-SSNR estimation made the model yield more reasonable results, and improved the estimation accuracy significantly. Inter-comparison with the GLASS revealed that the TM D-SSNR display a very similar spatial distribution but with more details. Besides, the GGM is also appropriate for Landsat 7/ETM+ data without any alteration. Further analysis on the effects of land cover type, elevation, sky condition and different seasons demonstrated that the GGM has a wide applicability.
However, some limitations of the GGM exist. First, the GGM is an empirical model whose accuracy is largely determined by the variety and quality of the collected measurements. Second, the key parameters and architecture of the GBRT algorithm were set empirically. Third, the influence of topography was not taken into account in this study. Besides, the ancillary information used (i.e., CI and WV) have a coarse spatial resolution, and may not match perfectly the site radiation observations. Moreover, Landsat 7 and 5 have a 16-day repeat cycle and provide only one observation per day; thus, the GGM is only for establishing one piece of a continuous time series of SSNR which may cause limitation of sequence analysis. All of these aspects need to be addressed in the future. In this study, MARS was first applied for Ins-SSNR and D-SSNR estimation. It was implemented on the R platform with the package "earth" [75], in which the input variables can be selected automatically. To obtain the optimal MARS model, the grid search method [68] was applied to determine the best parameter set of degree so that the minimum forecasting RMSE was generated. The backward stepwise process was carried out to train the MARS model.

• Support Vector Regression (SVR)
The SVR is a regression method using the same principles as the support vector machine (SVM) for classification, with only a few minor differences. It is proposed by [70], and although less popular than SVM, SVR has been proven to be an effective tool in real-value function estimation. As a supervised-learning approach, SVR trains using a symmetrical loss function, which equally penalizes high and low misestimates. Using Vapnik's-insensitive approach, a flexible tube of minimal radius is formed symmetrically around the estimated function, so that the absolute values of errors less than a certain threshold are ignored both above and below the estimate [76]. In this manner, points outside the tube are penalized, while those within the tube, either above or below the function, receive no penalty. One of the main advantages of SVR is that its computational complexity does not depend on the dimensionality of the input space. Additionally, it has excellent generalization capability, with high prediction accuracy. More detail about SVR can be found in [77].
In this study, SVR was implemented on the R platform with the package "kernlab" [78]. The kernel function "rbfdot" (Radial Basis kernel "Gaussian") was used in training and predicting. After looping in two parameters (C and sigma) threshold, the optimal parameterization scheme for different conditions were finally determined through the evaluation results (highest R 2 value and lowest MBE and RMSE values).
• Backpropagation Neural Network (BPNN) BPNN was used as an empirical nonlinear statistical method in a variety of applications and is the most widely applied neural network [69,70]. BPNN has proven to be an effective algorithm for estimating surface radiation budget variables, such as the incident shortwave radiation [71]. Therefore, BPNN was selected for the comparison of the performance among other methods in this study. A BPNN is a collection of electrical neurons interconnected to each other in various topologies and can be applied to any model in which the output variables are computed from the input variables. Figure A1 shows the architecture of the BPNN model used in this study, consisting of four layers of neurons: Input layer, two hidden layers, and output layer. Input x j m j=1 is transmitted through a connection that multiplies its strength by a weight represented by w ij k i=1 . This gives the value x i w ij , which is an argument to a transfer function f that yields an output y i .
where i is the index of neuron in the hidden layer and j is the index of inputs to the neural network.
The BPNN undergoes an adaptation cycle, during which the weights are updated until the network reaches a state of equilibrium. The BPNN activation function in the hidden layer was set to "logsig", the transfer function for the output layer to "purelin", and the training function to "trainlm", respectively.
In this study, the SSNR estimation utilized MATLAB Neural Net Tool Kit with the Matlab R2015b platform (The MathWorks, Natick, 2015). A typical feed-forward network trained with a resilient backpropagation algorithm [79,80] is employed in modeling. Finally, the optimal number of nodes in the hidden layers for different conditions were finally determined empirically. The GBRT is a flexible non-parametric statistical tool, firstly proposed by Friedman [71], that can be used in complex classification and prediction without pre-specifying the type of the relationship between input and response variables. Among algorithmic models, GBRT is unique in the sense that it achieves better predictive capacity than a single decision tree with large data size. The core idea of this model consists in producing a prediction model by constructing an M amount of different weak classifiers through multiple iterations in order to produce a highly accurate prediction rule. Each iteration is designed to improve the previous result by reducing the residuals of the previous model and establishing a new combination model in the gradient direction of the residual reduction. The core idea of GBRT gives it a natural advantage to discover a variety of distinguishing features and feature combinations. The general process of GBRT are shown in Yang et al. [81], and additional details of GBRT can be found in Hastie et al. [82] and Ridgeway [83].  Backpropagation Neural Network (BPNN) BPNN was used as an empirical nonlinear statistical method in a variety of applications and is the most widely applied neural network [69,70]. BPNN has proven to be an effective algorithm for estimating surface radiation budget variables, such as the incident shortwave radiation [71]. Therefore, BPNN was selected for the comparison of the performance among other methods in this study. A BPNN is a collection of electrical neurons interconnected to each other in various topologies and can be applied to any model in which the output variables are computed from the input variables. Figure A1 shows the architecture of the BPNN model used in this study, consisting of four layers of neurons: Input layer, two hidden layers, and output layer. Input is transmitted through a connection that multiplies its strength by a weight represented by . This gives the value , which is an argument to a transfer function f that yields an output .

= (9)
where i is the index of neuron in the hidden layer and j is the index of inputs to the neural network. Figure A1. Backpropagation neural network (BPNN) with multi-input-one-output architecture. Figure A1. Backpropagation neural network (BPNN) with multi-input-one-output architecture.