GLASS Daytime All-Wave Net Radiation Product: Algorithm Development and Preliminary Validation

Mapping surface all-wave net radiation (Rn) is critically needed for various applications. Several existing Rn products from numerical models and satellite observations have coarse spatial resolutions, and their accuracies may not meet the requirements of land applications. In this study, we develop the Global LAnd Surface Satellite (GLASS) daytime Rn product at a 5 km spatial resolution. Its algorithm, which converts shortwave radiation to all-wave net radiation using the Multivariate Adaptive Regression Splines (MARS) model, was selected after comparison with three other algorithms. The validation of the GLASS Rn product based on high-quality in situ measurements in the United States shows a coefficient of determination of 0.879, an average root mean square error of 31.61 W m⁻², and an average bias of −17.59 W m⁻². We also compare our product/algorithm with another satellite product (CERES-SYN) and two reanalysis products (MERRA and JRA55), and find that the accuracy of the much higher spatial resolution GLASS Rn product is satisfactory. The GLASS Rn product from 2000 to the present is operational and freely available to the public.


Introduction
Surface all-wave net radiation (Rn), which characterizes the available radiative energy at the Earth's surface and is usually called the surface radiation budget, is the difference between the total downward and total upward radiation. Mathematically, Rn consists of four components:

$$R_n = R_{ns} + R_{nl}, \qquad R_{ns} = R_{si} - R_{so} = (1 - \alpha) R_{si}, \qquad R_{nl} = R_{li} - R_{lo} \tag{1}$$

where Rns is the net shortwave radiation, Rnl is the net longwave radiation, Rsi is the incident shortwave radiation, Rso is the reflected shortwave radiation calculated by Rso = α Rsi where α is the shortwave broadband albedo, Rli is the downward longwave radiation, and Rlo is the outgoing longwave radiation. Rn drives the processes of evapotranspiration and air and soil heat fluxes, as well as other smaller energy-consuming processes such as photosynthesis [1,2]. Rn is a critical parameter for estimating evapotranspiration [3][4][5]. The net surface radiation controls the energy and water exchanges between the biosphere and the atmosphere, and has major influences on the Earth's weather and climate [6,7]. Thus, reliable spatial and temporal Rn information is required for many applications. However, in spite of its importance, directly measured Rn is available only from a very small number of standard radiometric observatories because of the expensive instrumentation and constant maintenance needed to guarantee reliable measurements [8]; these in situ measurements are thus unable to characterize the spatial variation.
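As a minimal illustration of Equation (1), the following sketch combines the four components into Rn. The function name and the numerical values are hypothetical and chosen only for illustration; they are not taken from the measurements used in this study.

```python
def net_radiation(r_si, albedo, r_li, r_lo):
    """All-wave net radiation (W m^-2) from its components, following Equation (1)."""
    r_so = albedo * r_si   # reflected shortwave radiation
    r_ns = r_si - r_so     # net shortwave, equal to (1 - albedo) * r_si
    r_nl = r_li - r_lo     # net longwave (downward minus outgoing)
    return r_ns + r_nl


# Illustrative (made-up) values for a single daytime sample
rn = net_radiation(r_si=800.0, albedo=0.2, r_li=350.0, r_lo=450.0)
print(rn)
```

With these inputs the net shortwave term is 640 W m⁻² and the net longwave term is −100 W m⁻², which shows how the longwave components reduce the energy supplied by shortwave radiation.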
Alternative methods for obtaining Rn are meteorological reanalysis and satellite remote sensing [9]. Table 1 gives detailed information about the commonly used Rn products. Reanalysis products are usually derived by merging available observations with an atmospheric model to obtain the best estimate of the states of the atmosphere and land surface [10], while existing remote sensing products are generated mostly from a radiative transfer model with atmospheric and surface parameters as inputs. From Table 1, we can see that the spatial resolutions of these products are too coarse for many land applications, although they are temporally continuous and globally complete. Another issue is that the accuracies of these products vary considerably and may not meet the requirements of applications such as global change research [9][10][11]. Therefore, a new long-term global Rn product with both high accuracy and fine spatiotemporal resolution is urgently needed.
Many geostationary and polar-orbiting satellite data are at the kilometer spatial resolution and can potentially be used for estimating the individual components of Equation (1), such as incident shortwave radiation [22] and albedo [23,24]. If all components in Equation (1) are known, the calculation of Rn is straightforward [25]. The difficulty in generating a global Rn product using Equation (1) is the estimation of the thermal components under cloudy conditions. This is why many studies focus only on shortwave radiation [26][27][28] or on clear-sky conditions [29]. If we know the exact atmospheric and surface properties, radiative transfer models enable us to calculate surface net radiation. However, it is extremely difficult to generate a complete set of atmosphere and surface products for model calculation at a high resolution. The most practical solution is to estimate incident shortwave radiation directly from satellite observations and then convert it into all-wave net radiation using either linear [30][31][32][33][34] or nonlinear [35,36] models. Jiang et al. [37] recently developed two artificial neural network models using comprehensive global in situ observations and found that the nonlinear models can produce better accuracy than the linear models.
The primary objective of this study is to generate an accurate, long-term, high-resolution Global LAnd Surface Satellite (GLASS) Rn product. The GLASS product suite is intended for long-term environmental change studies [38,39] and is continuously expanding. For generating such a global product, algorithm development is the key, and we must therefore balance the accuracy and computational efficiency of the algorithm. To achieve this objective, we explore two new nonlinear models: Multivariate Adaptive Regression Splines (MARS) and Support Vector Regression (SVR). After comparing them with two models developed earlier [17,20], we select the MARS model to produce the GLASS Rn product at a 5 km spatial resolution. The resulting high-resolution GLASS Rn product is further validated and compared with one satellite product and two reanalysis products. The details are presented in the following sections.

Data
The data used in this study comprise in situ radiation measurements, remote sensing products, and meteorological reanalysis data. The remote sensing products and reanalysis data were used to map Rn on a global scale. Based on the characteristics of these data, multiple trial experiments and pre-processing with strict quality control were conducted, and the data were aggregated to a daytime or diurnal scale. The variables considered in this study are shown in Table 2. Readers are referred to Jiang et al. [37] for more information about these data. As described above, more than 8000 validation samples were selected from the observations made in 2008, and most of them were from 25 sites. Table 3 gives detailed information about these sites, which were mainly from the ARM (http://www.arm.gov/) and SURFRAD (http://www.srrb.noaa.gov; [40]) observation networks. This ensures that the data are of the best quality currently available. The distribution of the 25 sites is shown in Figure 1. All ARM and SURFRAD data are carefully checked, and quality-control information is supplied with the data (ftp://aftp.cmdl.noaa.gov/data/radiation/surfrad/ and http://www.archive.arm.gov/; [41]).

Methods
The overall flowchart of this study is shown in Figure 2. First, the GLASS Rn algorithm was determined by comparing four Rn estimation models (Section 3): the multivariate adaptive regression splines (MARS) model, the support vector regression (SVR) model, a linear regression model (LM) [21], and a general regression neural network (GRNN) model [20]. To the best of our knowledge, this is the first time that the first two models have been applied to estimating surface net radiation. The details of the MARS model are introduced in this section, while the other three models are outlined in the Appendix. After validation and inter-comparison, the MARS model was selected to generate the GLASS daytime Rn product. Second, the MARS model was applied to produce the all-wave daytime Rn product by converting the GLASS Rsi product using other satellite products (e.g., NDVI and albedo) and meteorological information from MERRA (Section 4.1). Finally, the accuracy of the new GLASS Rn product was evaluated by validating it against the site observations and comparing it with other Rn products: one remotely sensed product (CERES-SYN) and two reanalysis products (MERRA and JRA55) (Section 4.2).

Remote Sens. 2016, 8, 222
Figure 2. Flowchart of this study; the explanations of the variables are given in Table 2.
MARS is a nonlinear and nonparametric regression model proposed by Friedman [42]. MARS is a generalization of the stepwise linear regression procedure for fitting an adaptive nonlinear regression to data. It is more flexible in modeling relationships that are nearly additive or involve interactions between variables. MARS uses expansions in piecewise linear basis functions of the form:

$$(x - t)_+ = \begin{cases} x - t, & x > t \\ 0, & \text{otherwise} \end{cases} \qquad \text{and} \qquad (t - x)_+ = \begin{cases} t - x, & x < t \\ 0, & \text{otherwise} \end{cases}$$

with x = t being a knot (linear splines). The smoothing function f is a linear expansion of the basis functions,

$$f(x) = \sum_{j} \theta_j h_j(x_j)$$

where $h_j(x_j)$ are the piecewise linear basis functions and $\theta_j$ are the coefficients that are estimated by minimizing the residual sum-of-squares using standard linear regression. In this study, MARS was first applied to daytime Rn estimation. It was implemented on the R platform with the package "mda," in which the input variables can be selected automatically. After extensive experiments, the maximum interaction degree between variables in MARS was set to 2, and the backward stepwise process was carried out to train the MARS model.
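The piecewise linear basis functions can be illustrated with a short sketch. It evaluates an additive MARS-style model for a single predictor from a list of hinge terms; it is not the "mda" implementation used in this study. The function names, the explicit intercept (a constant basis function), and the coefficients and knot are hypothetical, and interaction terms are omitted for brevity.

```python
def hinge_pair(x, t):
    """Return the reflected pair ((x - t)_+, (t - x)_+) for a knot at t."""
    return max(x - t, 0.0), max(t - x, 0.0)


def mars_predict(x, intercept, terms):
    """Evaluate f(x) = intercept + sum_j theta_j * h_j(x) for one predictor.

    terms is a list of (theta, knot, direction) tuples, where direction +1
    selects (x - knot)_+ and direction -1 selects (knot - x)_+.
    """
    total = intercept
    for theta, knot, direction in terms:
        positive_part, negative_part = hinge_pair(x, knot)
        total += theta * (positive_part if direction > 0 else negative_part)
    return total


# Hypothetical model: one knot at 100 with a different slope on each side
terms = [(0.7, 100.0, +1), (-0.1, 100.0, -1)]
high = mars_predict(300.0, intercept=20.0, terms=terms)
low = mars_predict(50.0, intercept=20.0, terms=terms)
```

Each hinge term is zero on one side of its knot, so the fitted function is continuous but changes slope at the knot, which is how MARS represents nonlinear relations with linear pieces.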

GLASS Daytime Rn Algorithm
The four models (LM, MARS, GRNN, and SVR) were trained one by one with half of the total number of in situ measurements and their corresponding reanalysis and satellite datasets (Figure 2). The four models were then applied to the other half, which served as an independent validation dataset, and their predictions were compared. The results are shown in Figure 3 and summarized in Table 4. Three fitting statistics were compared: the coefficient of determination (R²), the root mean square error (RMSE), and the bias. The computational times are also given in Table 4 for better comparison. In the present study, all the models were implemented under the Microsoft Windows 7 system on an Intel Core 3.20 GHz PC with 8 GB of memory.
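The three fitting statistics can be computed as in the following sketch. The exact definitions used for the tables are not stated in the text, so two assumptions are made here: R² is taken as the squared Pearson correlation between observations and predictions, and the bias is taken as the mean of prediction minus observation. The function name and sample values are hypothetical.

```python
import math


def fit_statistics(observed, predicted):
    """Return (r_squared, rmse, bias) for paired observations and predictions."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    # Assumed sign convention: bias = mean(predicted - observed)
    bias = sum(p - o for o, p in zip(observed, predicted)) / n
    rmse = math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)
    # Assumed definition: R^2 as the squared Pearson correlation coefficient
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    r_squared = cov ** 2 / (var_obs * var_pred)
    return r_squared, rmse, bias


# Made-up daytime Rn values (W m^-2)
obs = [100.0, 200.0, 300.0, 400.0]
pred = [110.0, 190.0, 310.0, 390.0]
r2, rmse, bias = fit_statistics(obs, pred)
```

In this toy case the errors alternate in sign, so the bias is zero while the RMSE is 10 W m⁻², which is why both statistics are reported together.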
Based on these comparison results (mainly the R² and RMSE values, since the bias values are relatively small), it is clear that the predictive abilities of the GRNN and SVR models were similar and better than those of the other two models. The LM model performed the worst, and the performance of the MARS model was intermediate. However, the computational times for model training and fitting differed considerably between the four models (see Table 4). In short, the computational efficiencies of the GRNN and SVR models were very low when large datasets were applied, and these two models are unsuitable for generating the long-term GLASS daytime Rn product. Extensive experiments were then performed to determine whether the sample sizes could be reduced without decreasing the fitting accuracy of these two machine learning models. Taking the SVR model as an example, it was found that the fitting accuracy decreased linearly as the training sample size was reduced (Table 5), so that by the time the training and fitting time became acceptable, the fitting accuracy was very close to that of the MARS model (Table 4). The situation for the GRNN model was similar. Given the trade-off between computational time and prediction accuracy, the MARS model was accepted as the first option for GLASS daytime Rn production.

To achieve a better understanding of the applicability of the MARS model, all samples were grouped into four categories in accordance with Jiang et al. [43], and the prediction accuracy of the MARS model for these four categories was compared. Jiang et al. [43] found that NDVI = 0.2 can be used as the threshold to identify vegetated surfaces, and that three more classes can be roughly divided based on albedo according to the different relations between Rsi and Rn when NDVI < 0.2 (see Table 6). The fitting results are compared in Table 7. In general, the four categories correspond to the major land cover types found on Earth: S1 (wetland), S2 (desert or barren land with sparse vegetation), S3 (snow/ice), and S4 (the remaining vegetated surfaces). Furthermore, seasonal information can also be represented by these categories. The results shown here were similar to those of our previous study [43], in that the simulation accuracies are much better for S1 and S4 because Rsi is the dominant factor for Rn in these two categories. Note that the statistical values are considerably different for snow/ice surfaces (S3) due to the high albedo and the clustering of all points. The results demonstrated the robustness of the MARS model in Rn estimation.

GLASS Daytime Rn Product
After the MARS model was chosen as the GLASS daytime Rn algorithm, the global GLASS daytime Rn product was generated and evaluated. For the global data production, the GLASS Rsi product was applied as the input instead of the in situ Rsi measurements. The GLASS Rsi product was generated from multiple polar-orbiting and geostationary satellite datasets with a look-up table algorithm. The validation results demonstrated that this product was superior to other products [44]. The instantaneous 3-hourly GLASS Rsi was first aggregated to a daytime temporal scale, and the climatic factors from the MERRA reanalysis product were then resampled to a 5 km spatial resolution to match the GLASS Rsi. The GLASS NDVI and GLASS ABD products were also used as inputs. Their spatial resolution is 5 km but their temporal resolution is eight days; therefore, every day within an eight-day period was assigned the same value, assuming little variation at the surface over the period.
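The eight-day to daily expansion described above can be sketched as follows. The sketch assumes, without confirmation from the text, that the eight-day periods start on day 1 of each year and that the last period is truncated at the end of the year; the function name and the placeholder values are hypothetical.

```python
def expand_eight_day(composites, days_in_year):
    """Repeat each eight-day composite value for every day that it covers."""
    daily = []
    for value in composites:
        remaining = days_in_year - len(daily)
        # The final period of the year may cover fewer than eight days
        daily.extend([value] * min(8, remaining))
    return daily


# 46 placeholder composites (indices stand in for NDVI or albedo values)
composites = list(range(46))
daily_values = expand_eight_day(composites, days_in_year=366)  # 2008 is a leap year
```

Each daily sample then carries the NDVI and albedo of the composite period that contains it, alongside the daily Rsi and meteorological inputs.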
More than 8000 samples from 25 sites in 2008 were extracted for independent validation of the GLASS daytime Rn product. All other samples from 251 sites were used for the MARS model training. The MARS model was then used to produce the GLASS daytime Rn product. Finally, the GLASS daytime Rn product from 2000 to the present, with a spatial resolution of 0.05° (~5 km) and global coverage, was generated in this study.
Figure 4b shows the GLASS daytime Rn on day 274 of 2008 (30 September 2008), a randomly selected date. The MERRA daytime Rn for the same day is shown in Figure 4a for comparison. From the two plots, we concluded that the spatial distribution of Rn is visually similar between the two products, but large discrepancies occur over many regions, such as the northern part of South America and the Chinese mainland. In addition, the GLASS daytime Rn product gives more detail than the MERRA dataset because of its higher spatial resolution. To identify which product is more accurate, validation results of the two products against the site observations are provided in Section 4.2. Note the missing values over part of the Arctic region and the entire Antarctic region, which are due to missing inputs from the GLASS Rsi product.

Validation and Comparison
For further evaluation, one remotely sensed product (CERES-SYN) and two model reanalysis products (MERRA and JRA55) were used for inter-comparison with the GLASS Rn product; spatiotemporal information about the datasets is given in Table 1. The CERES-SYN product is obtained by merging CERES observations aboard the NASA Terra and Aqua satellites with radiances observed from five geostationary satellites [18,45]. The MERRA product is provided by NASA's Global Modeling and Assimilation Office (GMAO); it is designed to integrate with NASA's Earth Observing System (EOS) satellite data for use in climate analysis [46]. JRA55 is offered by the Japan Meteorological Agency (JMA) and uses the TL319 version of JMA's operational data assimilation system, in which several newly available and improved past observations were used. JRA55 is recognized as an upgrade to JRA25 and is considerably better than JRA25 [16,47]. After conversion to the local time of each pixel, the three products were integrated to a daytime temporal scale according to the sunrise and sunset times, and then, along with the GLASS daytime Rn product, they were validated against the respective in situ measurements; the results are shown in Figure 5.

Based on the results, the GLASS and CERES-SYN daytime Rn products were superior to the two reanalysis products, MERRA and JRA55. By comparing the average RMSE and bias values, it becomes clear that the GLASS daytime Rn product is better than the CERES-SYN product. The average RMSE of GLASS daytime Rn was 31.61 W m⁻² and the average bias was −17.59 W m⁻², compared to 35.58 W m⁻² and −31.00 W m⁻² for the average RMSE and average bias of CERES-SYN. The comparison also showed that the GLASS product had increased low values and decreased high values (Figure 5a), whereas the CERES-SYN product was larger overall than the measurements (Figure 5b). In addition, the validation accuracy of GLASS against the measurements at each site is shown in Table 8. The R² values of most sites were larger than 0.85, the RMSE values were mostly around 30 W m⁻², and the bias values were chiefly smaller than 25 W m⁻², thereby demonstrating that the accuracy of the GLASS daytime Rn product is satisfactory for most applications. Two site examples are also given in Figure 6. Here, the CERES-SYN product was selected because it performed best among the three comparison products. The two plots show that the GLASS daytime Rn matched the site observations very well and that the CERES-SYN product was larger than both the measurements and the GLASS product, although its variations were reasonable. Overall, the GLASS daytime Rn product has the potential to be one of the best Rn products available for future applications.

Summary
A new Rn product that offers high spatiotemporal resolution, high accuracy, and global coverage over long time periods is urgently needed for a variety of applications. To achieve this goal, we developed the GLASS daytime Rn product. To determine the GLASS daytime Rn production algorithm, four models (LM, MARS, GRNN, and SVR) that convert incident shortwave radiation to all-wave net radiation were trained for Rn estimation and validated with high-quality measurements made in the United States. The validation results indicate that the GRNN and SVR models had better prediction accuracy than the other two empirical models, although they were unacceptably time consuming, and that the performance of the MARS model is promising. A further experiment also demonstrated that the MARS model is robust under various conditions. Therefore, as a result of the trade-off between the practical requirements of applications and the data fitting accuracy, the MARS model was selected as the final GLASS daytime Rn product algorithm. Finally, a global GLASS daytime Rn product for 2008 with a 5 km spatial resolution and a daytime temporal resolution was generated using the MARS model.
The new daytime Rn product was validated against measurements from 25 independent sites, and was also compared with one remotely sensed Rn product, CERES-SYN, and two reanalysis Rn products, MERRA and JRA55. The results illustrate that the new GLASS daytime Rn product delivers more detail at the global scale due to its relatively high spatial resolution and does so without spatial gaps, except for the Arctic and Antarctic regions, and that it has a continuous time series because all-sky conditions were considered in the MARS algorithm and the GLASS incident shortwave radiation product. The validation results of the GLASS daytime Rn product at the 25 sites were satisfactory for most applications, with an overall coefficient of determination of 0.88, an average RMSE of 31.61 W m⁻², and an average bias of −17.59 W m⁻². The comparison of GLASS with the three other products also showed that this new daytime Rn product performed much better than the two reanalysis products and similarly to CERES-SYN, but with a much higher spatial resolution. Overall, the results in the present study show that the GLASS daytime Rn product generated by the MARS model is superior to other presently available products.
Although it is common practice in the literature, directly validating satellite products and reanalysis datasets, whose spatial resolutions range from a few to hundreds of kilometers, against "point" ground measurements provides questionable results. Such validation is valid only if the atmospheric and surface conditions are homogeneous. An upscaling process using intermediate-resolution products is necessary for many heterogeneous landscapes and atmospheric conditions [48]. A further project that we are conducting to address the scaling issue is underway, and the results will be presented in the near future.
where $D = 5.31 \times 10^{-13}$ W m⁻² K⁻⁶ is an empirical constant suggested by Swinbank [50]; a, b, c, d, and e are regression coefficients; $T_{a,K} = T_{a,^{\circ}C} + 273.15$ is the absolute mean air temperature; and σ is the Stefan-Boltzmann constant ($5.67 \times 10^{-8}$ W m⁻² K⁻⁴). The definitions of the other variables can be found in Table 2.
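For orientation, the two constants above can be used as in the following sketch, which evaluates Swinbank's clear-sky downward longwave estimate, D·T⁶, and the blackbody emission, σ·T⁴, at a given air temperature. How these terms enter the linear model is not shown in the surviving text, so the sketch does not attempt to reproduce the LM equation; the function names are hypothetical.

```python
D_SWINBANK = 5.31e-13  # W m^-2 K^-6, empirical constant suggested by Swinbank
SIGMA = 5.67e-8        # W m^-2 K^-4, Stefan-Boltzmann constant


def to_kelvin(t_air_celsius):
    """Convert mean air temperature from degrees Celsius to kelvin."""
    return t_air_celsius + 273.15


def swinbank_downward_longwave(t_air_celsius):
    """Clear-sky downward longwave radiation (W m^-2) estimated as D * T^6."""
    return D_SWINBANK * to_kelvin(t_air_celsius) ** 6


def blackbody_emission(t_air_celsius):
    """Blackbody longwave emission (W m^-2) at air temperature, sigma * T^4."""
    return SIGMA * to_kelvin(t_air_celsius) ** 4


downward = swinbank_downward_longwave(20.0)
emitted = blackbody_emission(20.0)
```

At 20 °C the Swinbank estimate is smaller than the blackbody emission, so a net longwave term built from the two would be negative, consistent with the usual sign of Rnl during the day.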

B. General Regression Neural Network (GRNN) Model
The GRNN is a generalization of radial function networks and probabilistic neural networks developed by Specht [51]. Jiang et al. [37] applied the GRNN model to daytime Rn estimation, and the evaluation results showed that this model worked very well and was stable under various conditions. The architecture of the GRNN model used in this study is shown in Figure A1. The GRNN has a multi-input-output (one output in this study) architecture and includes four layers: the input layer, the pattern layer, the summation layer, and the output layer. The input layer provides all of the variables to the neurons in the pattern layer; each neuron represents a training pattern, and its output is a measure of the distance of the input from the stored patterns. The summation layer has two types of summation neurons: one computes the sum of the weighted outputs of the pattern layer, and the other computes the unweighted sum of the outputs of the pattern neurons. Finally, the output layer performs a normalization step to yield the predicted value of the output variable. In accordance with Jiang et al. [37] and Xiao et al. [52], the Gaussian kernel function was used for GRNN training in the present study, and the smoothing parameter in the kernel function was the only free parameter that needed to be determined. Thus, GRNN training was essentially the optimization of the smoothing parameter, and the architecture and weights of a GRNN were determined once the input was given. More details on the selection of the optimal smoothing parameter can be found in Xiao et al. [52]. Because all of the parameters in a GRNN can be determined automatically, the entire training dataset was used for GRNN training, as was the case for the LM and MARS models, and all of the inputs were linearly scaled before training, as was done by Jiang et al. [37]. The GRNN modeling was implemented in C.
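The four-layer structure described above reduces to a kernel-weighted average of the training targets. The following sketch shows the prediction step with a Gaussian kernel; it is not the C implementation used in this study, and the function name, the toy training patterns, and the smoothing parameter values are hypothetical.

```python
import math


def grnn_predict(x, train_x, train_y, sigma):
    """Predict the output for input vector x with a Gaussian-kernel GRNN."""
    weighted_sum = 0.0    # summation neuron 1: sum of weighted pattern outputs
    unweighted_sum = 0.0  # summation neuron 2: sum of pattern outputs
    for pattern, target in zip(train_x, train_y):
        # Pattern layer: squared Euclidean distance passed through the kernel
        dist2 = sum((a - b) ** 2 for a, b in zip(x, pattern))
        activation = math.exp(-dist2 / (2.0 * sigma ** 2))
        weighted_sum += activation * target
        unweighted_sum += activation
    # Output layer: normalization step
    return weighted_sum / unweighted_sum


# Two stored patterns with one (scaled) input variable each
train_x = [[0.0], [1.0]]
train_y = [0.0, 10.0]
midpoint = grnn_predict([0.5], train_x, train_y, sigma=0.5)
near_first = grnn_predict([0.0], train_x, train_y, sigma=0.1)
```

Because every training pattern is visited for every prediction, the cost of a single prediction grows with the training sample size, which is consistent with the long computation times reported for this model on large datasets.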

C. Support Vector Regression (SVR) Model
The SVR is a method proposed by [53] to solve regression problems using support vector machines (SVM). Intuitively, SVR works by performing a nonlinear mapping of the data from the input space to a higher dimensional feature space where linear regression can then be performed. In the case of a linear SVR model, if the training data are of the form $\{(x_1, y_1), (x_2, y_2), \cdots, (x_n, y_n)\}$ ($x_i \in \mathbb{R}^d$, $y_i \in \mathbb{R}$, and n is the sample number), then the solution function takes the form:

$$f(x) = \sum_{i=1}^{n} (\alpha_i - \alpha_i^*) \langle x_i, x \rangle + b$$

where $\langle \cdot, \cdot \rangle$ represents the dot product of two points and the variables $\alpha_i$, $\alpha_i^*$, and b are calculated by the SVR algorithm. Only the patterns for which the term $(\alpha_i - \alpha_i^*)$ is nonzero have an impact on the final solution, and these patterns are referred to as the support vectors. However, when the SVR model is nonlinear, the input data must first be transformed into a higher dimensional feature space using a nonlinear mapping function. In accordance with Mercer's theorem, a kernel function k was used without ever explicitly computing the mapping, and the original solution function was changed according to the following equation:

$$\text{maximize} \quad -\frac{1}{2} \sum_{i,j=1}^{n} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*)\, k\langle x_i, x_j \rangle - \varepsilon \sum_{i=1}^{n} (\alpha_i + \alpha_i^*) + \sum_{i=1}^{n} y_i (\alpha_i - \alpha_i^*) \tag{A4}$$

subject to $\sum_{i=1}^{n} (\alpha_i - \alpha_i^*) = 0$ and $\alpha_i, \alpha_i^* \in [0, C]$, where $x_i$ and $y_i$ are the input-output pairs of the training data, $\alpha_i$ and $\alpha_i^*$ are the variables to be determined, and ε and C are constants. The constant ε represents the SVR algorithm's tolerance for errors. The area within ±ε of the learned function is referred to as the SVR regression tube, and any errors that fall within this tube are ignored. The constant C is referred to as the penalty factor, which controls the trade-off between the complexity of the function and the frequency with which errors are allowed to fall outside of the SVR regression tube. The constants ε and C and the parameter to be set in the kernel function are usually called "hyper-parameters"; previous studies have shown that the simulation results of SVR are sensitive to the selection of the hyper-parameters.
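The role of ε and C can be illustrated with the ε-insensitive loss that underlies the regression tube. This is a minimal sketch rather than part of the SVR implementation used in this study; the function names and numerical values are hypothetical.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Loss is zero inside the regression tube and grows linearly outside it."""
    return max(abs(y_true - y_pred) - epsilon, 0.0)


def penalized_error(y_true_list, y_pred_list, epsilon, c):
    """Total penalty for errors outside the tube, weighted by the penalty factor C."""
    return c * sum(
        epsilon_insensitive_loss(t, p, epsilon)
        for t, p in zip(y_true_list, y_pred_list)
    )


inside = epsilon_insensitive_loss(100.0, 100.3, epsilon=0.5)   # error within the tube
outside = epsilon_insensitive_loss(100.0, 102.0, epsilon=0.5)  # error beyond the tube
total = penalized_error([100.0, 100.0], [100.3, 102.0], epsilon=0.5, c=10.0)
```

A larger ε widens the tube and ignores more small errors, while a larger C penalizes the remaining errors more strongly at the cost of a more complex function.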
In the present study, "eps-regression" was selected as the SVR regression type, and the radial basis kernel function was used. Thus, three parameters (ε, C, and γ) must be determined for SVR training. To obtain the optimal SVR model, the training dataset was further randomly divided into 80% and 20% proportions for model building and testing, respectively, and the step-by-step search method was applied to model building to obtain the optimal hyper-parameters. The ranges of the hyper-parameters were predefined as ε ∈ [0.01, 1], C ∈ [1, 100], and γ ∈ [0.01, 1]. Several combinations of the hyper-parameters were then tried for SVR building and testing, and finally the optimal combination was determined based on which combination provided the best testing accuracy. SVR modeling was also implemented on the R platform with the "e1071" package [54], and all inputs were Z-score normalized before training.
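The search over hyper-parameter combinations can be sketched as a plain grid search. The sketch does not train an SVR: the `score` callable is a hypothetical stand-in for "build the model on the 80% subset and return the test error on the 20% subset", and the candidate values are illustrative points inside the predefined ranges rather than the steps actually used. A Z-score helper is included because the inputs were normalized before training; whether the sample or the population standard deviation was used is not stated, and the population form is assumed here.

```python
import itertools
import math


def z_score(values):
    """Normalize values to zero mean and unit (population) standard deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values]


def grid_search(eps_values, c_values, gamma_values, score):
    """Return (best_score, eps, c, gamma) for the lowest score over all combinations."""
    best = None
    for eps, c, gamma in itertools.product(eps_values, c_values, gamma_values):
        current = score(eps, c, gamma)
        if best is None or current < best[0]:
            best = (current, eps, c, gamma)
    return best


def toy_score(eps, c, gamma):
    """Hypothetical test error with its minimum at eps=0.1, C=10, gamma=0.1."""
    return (eps - 0.1) ** 2 + (c - 10) ** 2 + (gamma - 0.1) ** 2


best = grid_search([0.01, 0.1, 1.0], [1, 10, 100], [0.01, 0.1, 1.0], toy_score)
scaled = z_score([1.0, 2.0, 3.0])
```

The number of model fits equals the product of the three candidate list lengths, which is one reason the hyper-parameter search adds to the training time of the SVR model.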

Figure 1. Distribution of the 25 validation sites in the United States.
[Figure 2 flowchart, box text: input datasets (..., CI, BI; remotely sensed data: GLASS ABD, NDVI); four-model training (LM, MARS, GRNN, and SVR trained with 50% of the in situ measurements); four-model comparison (the other 50% of the samples for independent validation); GLASS Rn model training (all in situ measurements excluding the year 2008); GLASS Rn algorithm determined (validation results and computational efficiency considered); global GLASS daytime Rn production (2008); validation (against the observations from 25 ARM and SURFRAD sites); comparison (with one remotely sensed product, CERES-SYN, and two reanalysis products, MERRA and JRA55).]

Figure 3. Scatterplots of the predictions from the four models: (a) LM; (b) MARS; (c) GRNN; (d) SVR.

Figure 4. The MERRA (a) and GLASS (b) daytime Rn products on day 274 (30 September) of 2008. The white spaces indicate missing data.

Figure 6. Temporal profiles of site measurements, GLASS, and CERES-SYN values at the ARM_E01 site (a) and the SF_SXF site (b) for the year 2008.

Figure A1. General regression neural network (GRNN) with multi-input-one-output architecture. The inputs x_i (i = 1, ..., n) are shown in Table 2, and the output y represents Rn.

Table 1. Characteristics of the commonly used Rn datasets.

Table 2. Explanation and sources of the variables.
Rsi* stands for the GLASS Rsi product and was used for global GLASS daytime Rn production in this study.

Table 3. Information about the 25 validation sites.

Table 4. Statistical results of the four models.

Table 5. Fitting accuracy of SVR with different sample sizes.

Table 6. Four classifications based on combinations of Normalized Difference Vegetation Index (NDVI) and albedo (see Table 2), with their corresponding numbers of samples.

Table 7. Validation statistics of the MARS model for the four categories.

Table 8. Validation results between GLASS daytime Rn and the observations at the 25 validation sites.