Feasibility of Estimating Cloudy-Sky Surface Longwave Net Radiation Using Satellite-Derived Surface Shortwave Net Radiation

: Surface longwave net radiation (LWNR) is a vital component in the surface radiation budget. Major progress has been made in the estimations of clear-sky LWNR. However, the estimation of cloudy-sky LWNR remains a signiﬁcant challenge. In this paper, a linear model (LM) and a multivariate adaptive regression spline (MARS) model were developed to estimate the cloudy-sky LWNR from a satellite-derived surface shortwave net radiation product. Spatially and temporally matched satellite data and ground-measured LWNR, which was collected at 24 sites from four networks, were used to build and validate the linear and MARS models. The effects of land cover, climate type, and surface elevation on the estimate of LWNR were also analyzed. The MARS model, incorporating the normalized difference vegetation index (NDVI) and surface elevation (H) as the inputs, had the best performance. The determination coefﬁcient, BIAS, and root mean square error (RMSE) were 0.51, 0.01 W/m 2 , and 26.10 W/m 2 , respectively. The developed model, when combined with freely distributed Global LAnd Surface Satellite (GLASS) products, showed promise for producing surface LWNR and all-sky surface net radiation.


Introduction
Surface net radiation plays a fundamental role in determining atmospheric and oceanic thermal conditions and circulations, which shape the main characteristics of the earth's climate. Surface net radiation is also a necessary input for land surface models that characterize the hydrological, ecological, and biogeochemical processes on earth [1][2][3]. Surface net radiation is the sum of shortwave net radiation and longwave net radiation (LWNR) [4], which can be expressed by the following equations: where R n is the surface net radiation, R nS is the shortwave net radiation, R nL is the LWNR, R Sd is the shortwave downward radiation, R Su is the shortwave upward radiation, R Ld is the longwave

Satellite Data
Global LAnd Surface Satellite (GLASS) products are a series of freely available, long-term, high-quality datasets that include leaf area index (LAI), shortwave broadband albedo (ABD), broadband emissivity (BBE), downwelling shortwave radiation (DSR), and photosynthetically active radiation (PAR) at the first stage [31]. There are twelve products in GLASS, which were gradually distributed. GLASS DSR and GLASS ABD products from 2008-2010 were used in this study to estimate surface shortwave net radiation. The GLASS DSR product, with a three-hour temporal resolution and a five-kilometer spatial resolution, was produced from multi-source remote sensing data, using a look-up table algorithm [32]. The GLASS ABD was produced by integrating two intermediate albedo products that were derived by a direct-estimation algorithm with a statistics-based temporal filter (STF) [33,34]. Thus, it was more complete and accurate than the intermediate albedo products. The spatial and temporal resolutions were 1 km and eight days, respectively [34]. The albedo was assumed to be unchanged during the 8-day period. Previous studies have noted that GLASS DSR and ABD products have higher accuracy than mainstream products [35,36]. Furthermore, the MODIS vegetation indices product MOD13A2 was also used to describe the land characteristics. MOD13A2 data were provided every 16 days at a 1 km spatial resolution as a gridded level 3 product in the sinusoidal projection. Each day during the 16-day period was set to be the same as the GLASS ABD.

Ground Measurements
The most popular instrument that is used to directly measure SLUR and SLDR is called a pyrgeometer, which has evolved into the precision infrared radiometer (PIR) [1]. It is relatively expensive and sensitive when compared to the currently used pyranometers, which are employed to measure surface shortwave radiation [20]. Therefore, the measurements of SLUR and SLDR are not as common as the measurement of shortwave radiation. To support scientific research, several surface radiation observation networks that provide continuous surface longwave radiation measurements have been established, and the ground-based data are publicly available. Ground-based measurements of SLUR and SLDR, which were collected from 24 sites in four networks, were used in this study in order to develop and validate the models. These included nine sites from the Arid and Semi-Arid Region Collaborative Observation Project (ASRCOP) network [36], six sites from the Surface Radiation (SURFRAD) network [37], six sites from the AmeriFlux network [38], and three sites from the Baseline Surface Radiation Network (BSRN) [39]. All of the data underwent a manual visual inspection before they were used. Table 1 summarizes the site information, which includes latitude, longitude, elevation, land cover, climate type, and observation period. The types of land cover at the sites include desert, cropland, grassland, and forest. The elevations of the sites vary from 25 m to 3033 m. The data period covers 2008-2010. The spatial distribution of these sites is shown in Figure 1.

Methods
According to Equation (1), LWNR can be expressed as follows: Previous studies have shown that surface net radiation can be estimated directly from surface shortwave radiation. Kaminsky and Dubayah explored the relationship between surface net radiation and shortwave fluxes in central Canada. They suggested that a single linear equation, using surface net shortwave radiation, can be used to estimate surface net radiation [40]. Alados et al. did similar work in a semi-arid area [41]. Therefore, the right-hand side of Equation (4) can be expressed only by R ns . This means that we can estimate LWNR by R ns . As surface shortwave net radiation can be precisely estimated under all weather conditions, we explored the relationship between LWNR and surface shortwave net radiation to estimate LWNR under a cloudy sky. This allows for measurements of cloud information to be bypassed, including cloud-top temperature, cloud-top height, and cloud optical depth, which is unstable. A linear model and a multivariate adaptive regression splines (MARS) model were used to predict LWNR. In addition, NDVI was also incorporated into the developed models to qualify the effect of vegetation on the estimation of LWNR.

Linear Model
The linear model (LM) is expressed as follows: When the NDVI was considered, the linear model becomes the following: where a 1 , b 1 , a 2 , b 2 , and c 2 are regression coefficients; α is the shortwave broadband albedo derived from GLASS ABD; R Sd is the surface shortwave downward radiation derived from GLASS DSR; and NDVI is derived from MOD13A2. To fit the above equations, spatial and temporal matching between satellite data and site measurements were implemented through three steps: (1) Spatial matching. The location of each site was determined by matching the site's coordinates and the geolocation fields of the images; (2) Cloudy-sky identification. The cloudy-sky condition of the sites was determined by cloud fraction c. The calculation for c was referenced from Carmon et al. [19]. When c was greater than 0.05, the sky was considered to be cloudy; (3) Temporal matching. The timing of the satellite data and measurements was matched using the nearest-neighbor method to obtain time-matched data. Following the above processes, more than 90,000 samples were obtained. The samples were randomly divided into two parts. Two-thirds of the samples were used to fit the coefficients in Equations (5) and (7) by linear regression fitting, and one-third of the samples were used to validate the developed models.

MARS Model
Multivariate adaptive regression spline (MARS) is a form of regression analysis that was introduced by Jerome H. Friedman in 1991 [42]. MARS builds models in the following form: where B i (x) is a basis function, which is either a constant or has the form max(0, x − const) or max(0, const − x), and c i is a constant coefficient. MARS is a non-parametric regression technique, in which nonlinear responses between variables are described by a series of linear segments of different slopes, each of which is fitted using a basis function. The breaks between segments are defined by a knot in a model that initially over-fits the data and then is simplified using a backwards/forwards stepwise cross-validation procedure to identify the terms that are to be retained in the final model. This produces continuous models with continuous derivatives, and has more power and flexibility to model relationships that are nearly additive or involve interactions between a few variables at most.
MARS was applied for LWNR estimation in this study. To explore the nonlinearity among LWNR, net shortwave radiation, and NDVI, we used MARS to establish the nonlinear relationship using the same samples as in Section 3.1. This was implemented on the MATLAB platform with a tool called ARESLab. First, we used the function aresev to select the number of basis functions using cross-validation. Figure 2 shows the change in mean square error (MSE) for the models of each fold as the number of basis functions increases. The ten dotted pink lines indicate the MSE for the models of each fold. The solid pink line is the mean MSE for each model size. The circle and vertical dashed lines are at the minimum of the solid lines, which represent the optimum number of basis functions. As shown in Figure 2, the number of basis functions was set to 11. We then used the two part samples from Section 3.1 to develop and validate the MARS model.

Validation of the LM Model
Using the method described in Section 3, we obtained 92,220 samples. The coefficients in Equation (5) were fitted by linear regression using two-thirds of the samples: The BIAS and root mean square error (RMSE) were used as the primary indicators of accuracy. In addition to the BIAS and RMSE, the determination coefficient (R 2 ) was used as an indicator to test the performance of the developed model. The training results are presented in Figure 3a. The R 2 was 0.34, and the BIAS and RMSE were 0.006 W/m 2 and 30.22 W/m 2 , respectively. N is the number of samples for training purposes. An evaluation of the model's accuracy was then implemented using the remaining one-third of samples that were extracted. The validation results are presented in Figure 3b. N is the number of samples for validation purposes. The R 2 was 0.34, and the BIAS and RMSE were 0.02 W/m 2 and 30.19 W/m 2 , respectively. The LWNR values were overestimated at the low end of the LWNR range-where the ground-measured LWNR values were approximately −150 W/m 2 -while the LWNR values were underestimated at the high end of the LWNR range-where the ground-measured LWNR values were approximately 0. In addition, there was a ceiling for the linear model. This may have been because the surface shortwave net radiation was not enough to estimate LWNR.
In addition to clouds, many factors can affect the surface radiation budget, including near-surface air temperature and humidity, soil moisture, and land cover type. However, most of these parameters cannot be obtained directly by optical-thermal satellite under cloudy conditions. The NDVI from MOD13 was used to provide additional land surface information. We then obtained a new linear model (LM-NDVI) with two variables as follows: The training results are presented in Figure 4a. The R 2 was 0.37, and the BIAS and RMSE were 0.013 W/m 2 and 29.56 W/m 2 , respectively. One-third of the samples were then extracted and used to validate the LM-NDVI model. The evaluation results of LM-NDVI are presented in Figure 4b. The R 2 was 0.37, and the BIAS and RMSE were 0.08 W/m 2 and 29.54 W/m 2 , respectively. Compared with the LM model, the LM-NDVI model did not improve significantly, but the distribution of scatters appears to be more reasonable. There is no obvious upper limit for LM-NDVI.

Validation of the MARS Model
The MARS model was trained with the same samples that were used in Section 4.1. Figure 5a presents the training results of the MARS model. The validation process was performed to evaluate the MARS model. These results are presented in Figure 5b. The performance of the MARS model was slightly better than that of the LM model. For the validation results, the determination coefficient was 0.36, and the BIAS and RMSE values were 0.02 W/m 2 and 29.86 W/m 2 , respectively. Similar to the LM model, the obvious maximum and minimum values of the retrieved LWNR were −20 and −120, respectively.
For the same reason as in Section 4.1, NDVI was considered and inputted into the MARS model for new LWNR predictions (MARS-NDVI model). Figure 6 plots the scatter density of the ground measurements against the MARS-NDVI model-predicted LWNR using the training samples and validation samples, respectively. The validation results were almost the same as the training results. This indicates that the training samples were sufficient. The validation results indicated that the LWNR predictions were well-related to the ground measurements, with a determination coefficient of 0.46, and small BIAS and RMSE values of 0.08 W/m 2 and 27.49 W/m 2 , respectively. Moreover, the MARS-NDVI model performed much better than the MARS model, indicating that the NDVI was a factor that could not be neglected and needed to be incorporated into the LWNR estimations. Meanwhile, the MARS-NDVI model had a better performance than the LM-NDVI model, which indicated that the nonlinear model was more robust than the linear model.  Vegetation has obvious seasonal variations and may have certain effects on different LWNR estimation models. Therefore, in order to further investigate the adaptabilities of different LWNR estimation models among different seasons, the samples were divided into four parts-namely, spring, summer, autumn, and winter-to test the four developed models. The results are presented in Figure 7 and summarized in Table 2. The results indicated that the LWNR predictions had obvious seasonal differences, without considering the NDVI. Both the LM and MARS performed the worst in summer, with the largest RMSE of greater than 30 W/m 2 . These models had better performances in winter. This is because less surface shortwave net radiation was transformed into longwave radiation as a result of the transpiration of the vegetation. The correlation between surface shortwave net radiation and LWNR weakened with the increase in vegetation coverage. Therefore, the seasonal differences were significantly decreased when considering NDVI.  The validation accuracy of the developed MARS-NDVI model was comparable to the accuracy of the general model from Zhou et al. [29], which is reported to have a BIAS and RMSE of −2.31 and 29.25 W/m 2 , respectively, at nine AmeriFlux sites. The developed MARS-NDVI model is a promising method for estimating LWNR using current satellite products (e.g., GLASS SDR and ABD, MODIS, and NDVI).

Discussion
Earth-atmosphere interactions could be affected or modulated by many factors, such as land cover, climate type, and surface elevation. Therefore, we attempted to investigate the influence of these factors on the performances of the above four developed models in this section. First, the land surface was divided into four primary land cover types-namely, desert, cropland, grassland, and forest. Then, the accuracy of each model for the different land cover types was evaluated. The evaluation results are presented in Figure 8a,b. The statistical results are provided in Table 3. All four of the models performed much worse over desert than they did over vegetated areas, with an RMSE of approximately 40 W/m 2 . This may be related to the characteristics of high albedo and low specific heat capacity in arid areas [43].
Based on the Koppen climate classification, we divided the samples into five groups: BS (semi-arid climate), BW (desert climate), Cf (temperate climates), Df (continental climates), and Dw (dry winter continental climates). We then evaluated the performance of the four models over the different climate types. The results are presented in Figure 8c,d. All of the models performed better over humid climate types (Cf, Df) than they did over dry climate types (BS, BW, and Dw). As shown in Table 3, the RMSE was approximately 20 W/m 2 over humid climate types and greater than 30 W/m 2 over dry climate types. This was consistent with the results obtained for desert land cover. Table 3. Summary of statistics of the four developed models for the different land cover types, climate types, and surface elevations (H). Semi-arid climate (BS); desert climate (BW); temperate climates (Cf); continental climates (Df); and dry winter continental climates (Dw).

Class
No  Similar to the land cover and climate type analysis, the samples were divided into three groups according to the elevation of the sites (H < 500 m, 500 < H < 1000, and H > 1000). The accuracy of the four models at the different heights is presented in Figure 8e,f. The accuracy decreased as elevation increased (Table 3). Thus, surface elevations were incorporated into the developed MARS-NDVI model (named the MARS-NDVI-H model) to improve the model's performance. Both the training and validation results were improved. Figure 9 shows the scatter density of the ground measurements against the MARS-NDVI-H model-predicted LWNR using the same training and validation samples that were used for the MARS-NDVI model, respectively. The determination coefficient increased to 0.51, and the BIAS and RMSE were reduced to 0.01 W/m 2 and 26.10 W/m 2 , respectively. Surface elevations were also incorporated into the developed LM-NDVI model (named the LM-NDVI-H model). The performance of the LM-NDVI-H model was improved, but was poorer than the performance of the MARS-NDVI-H model. According to the above analysis, it was more desirable to develop site-specific LWNR estimation models. However, such models may have lacked representativeness due to limited ground observations. On the other hand, the uncertainties in land cover type, climate classifications, and the adopted thresholds may have resulted in larger errors in the estimated LWNR. Therefore, we preferred to use the unified model for its simplicity and acceptable accuracy at the current stage.
Though the BIAS and RMSE of the developed MARS-NDVI-H model were acceptable, the determination coefficient was relatively low. Primarily, there were two potential error sources: (1) Cloud-base temperature and vertical structure are not considered in GLASS DSR. GLASS DSR is retrieved from optical data using a look-up table method that only considers cloud type and absorption coefficient [32]. Cloud-base temperature and vertical structure can affect the accuracy of the SLDR estimate [44]. (2) The temporal mismatch between satellite data and in situ measurements. The three-hour GLASS DSR was interpolated from the derived instantaneous DSRs using the nearest-neighbor method. The satellite data and ground measurements were also matched using the nearest-neighbor method in this study. The first error source was hard to overcome because the operational method of DSR estimation, which could incorporate cloud base temperature and vertical structure, was unavailable. As a result of an unaffordable memory requirement in the production of GLASS DSR, the retrieved instantaneous DSR was not saved. We can output the instantaneous DSR as an intermediate product in the near future, which would greatly alleviate the effect of the temporal mismatch between satellite data and in situ measurements. Therefore, the second error could be greatly reduced and the determination coefficient of the MARS-NDVI-H model could be improved.

Conclusions
Satellite estimates of cloudy-sky LWNR remain challenging. In this study, the feasibility of estimating LWNR using satellite-derived surface shortwave net radiation was investigated. A large number of spatially and temporally matched samples under a cloudy sky were extracted from ground-based measurements, which were collected at 24 sites from four networks and remote sensing products (GLASS and MODIS data). The extracted samples were divided randomly into two parts, in which two-thirds of the matched samples were used for model development and one-third of the matched samples were used for validation purposes. Linear and MARS models for LWNR estimation were developed and validated using in situ observations. Moreover, the adaptabilities of the above LWNR estimation models among different seasons were investigated. The influence of land cover, climate type, and elevation on these four models was also discussed.
According to the validation results that were obtained from the limited data, the LM model had the poorest performance. The determination coefficient, BIAS, and RMSE were 0.34, 0.02 W/m 2 , and 30.19 W/m 2 , respectively. The nonlinear MARS-NDVI-H model performed the best, with a determination coefficient, BIAS, and RMSE of 0.51, 0.01 W/m 2 , and 26.10 W/m 2 , respectively. This accuracy was comparable to that of a previous study that estimated LWNR using ground-measured surface shortwave net radiation. We are deploying the flux measurement sites in China to acquire a time series of SLUR and SLDR measurements, and are collecting SLUR and SLDR measurements from existing flux measurement networks. We will update the developed models when more ground measurements and instantaneous GLASS DSR become available.
Our group has produced and freely distributed high-quality GLASS DSR, GLASS ABD, and clear-sky LWNR products, and it is promising to produce the corresponding LWNR product with the developed model. Additionally, we can calculate the all-sky surface net radiation from each component of the surface radiation budget.