A Method to Estimate Sunshine Duration Using Cloud Classification Data from a Geostationary Meteorological Satellite (FY-2D) over the Heihe River Basin

Sunshine duration is an important variable that is widely used in atmospheric energy balance studies, analysis of the thermal loadings on buildings, climate research, and the evaluation of agricultural resources. In most cases, it is calculated using an interpolation method based on regional-scale meteorological data from field stations. Accurate values in the field are difficult to obtain without ground measurements. In this paper, a satellite-based method to estimate sunshine duration is introduced and applied over the Heihe River Basin. The method is based on hourly cloud classification product data from the FY-2D geostationary meteorological satellite. A new index, the FY-2D cloud type sunshine factor, is proposed, and the Shuffled Complex Evolution Algorithm (SCE-UA) is used to calibrate the sunshine factors of the different cloud types based on ground measurement data from the Heihe River Basin in 2007. The sunshine duration estimated by the proposed algorithm was validated with ground observation data for 12 months in 2008, and its spatial distribution was compared with the results of an interpolation method over the Heihe River Basin. The study demonstrates that geostationary satellite data can be used to successfully estimate sunshine duration. Potential applications include climate research, energy balance studies, and global estimations of evapotranspiration.


Introduction
Sunshine duration, also known as the duration of real sunshine, is the time that the sun actually illuminates the Earth's surface and is a measurement index of light resources. Sunshine duration is an important variable that is widely used in studies of atmospheric energy balance [1], analyses of the thermal loads on buildings, climate research, and the evaluation of agricultural resources [2,3] and is used to build estimation models of surface solar radiation [4][5][6]. Therefore, the accurate estimation of sunshine duration is important for researchers working in meteorology, hydrology, and agriculture.
The World Meteorological Organization (WMO) defines sunshine duration as the number of hours for which the direct solar irradiance is above 120 W/m². There are many methods to measure sunshine duration, including direct measurement with sunshine duration recorders, the pyrheliometric method using direct irradiance from a pyrheliometer, and pyranometric algorithms using the global irradiance from a pyranometer [7][8][9]. Although these methods are accurate for measuring the sunshine duration at a station representing a certain area, using them for large regional assessments of sunshine duration is time-consuming and expensive because it requires numerous ground installations, especially when a large spatial coverage and high sampling frequency are desired. Interpolation methods with data from measurements at field meteorological stations are used to obtain the regional sunshine duration; however, the accuracy of interpolation methods is affected by the number and spatial distribution of meteorological stations. In addition, most meteorological stations are near cities, and there are areas without weather stations; as a result, the sunshine duration data from meteorological stations are often inadequate for representing the actual complex climate characteristics of geographic sunshine hours at a regional scale.
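The WMO threshold definition can be illustrated with a minimal sketch of the pyrheliometric approach. The function name, the fixed sampling step, and the sample readings are assumptions made for illustration and are not taken from the paper.

```python
from typing import Sequence

# WMO threshold: direct solar irradiance must be above 120 W/m^2.
WMO_THRESHOLD_W_M2 = 120.0


def sunshine_duration_hours(direct_irradiance: Sequence[float],
                            step_minutes: float) -> float:
    """Sum the time during which direct irradiance exceeds the WMO threshold.

    Assumes evenly spaced samples, each representing `step_minutes` of time.
    """
    sunny_samples = sum(1 for value in direct_irradiance
                        if value > WMO_THRESHOLD_W_M2)
    return sunny_samples * step_minutes / 60.0


# Hypothetical 10-minute pyrheliometer readings (W/m^2) around a passing cloud.
readings = [0.0, 80.0, 150.0, 420.0, 610.0, 95.0, 300.0, 118.0]
print(sunshine_duration_hours(readings, step_minutes=10))
```

Four of the eight hypothetical readings exceed the threshold, so the sketch counts 40 minutes of sunshine.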
To obtain the sunshine duration at a regional scale, various empirical methods have been recommended. Two types of empirical methods are used to estimate sunshine duration in the field. One approach is the sunshine duration percentage method, which includes the astronomical sunshine percentage and the geographic sunshine percentage. The astronomical sunshine percentage is the ratio of sunshine duration to available sunshine duration, and the geographic sunshine percentage is the ratio of the actual sunshine duration to available sunshine duration. The astronomical sunshine percentage can be calculated using the sun trajectory equation [10]. The geographic sunshine percentage can be calculated by subtracting the amount of terrain masking from the astronomical sunshine percentage [11,12]. The sunshine duration can then be calculated using the Angstrom method based on the assimilation of radiation data [13][14][15][16][17][18][19][20][21], but this empirical method relies heavily on the accuracy of the assimilated radiation data. The other approach uses statistical models based on the clearness index, cloud amount, total cloud cover, precipitation, wind speed, air pollution index (API), cloud type at hourly or shorter intervals, and surface incoming direct radiation. Empirical equations have been proposed based on the relative sunshine duration and the aforementioned readily available data from ground measurements (meteorological, air quality and radiometric data from various stations) and remote sensing data [7][22][23][24]. However, these empirical methods either do not consider the changes in cloud amount, air pollution index, and precipitation and can therefore calculate only the monthly sunshine duration, or they require surface incoming direct radiation data in addition to the cloud type data between sunrise and sunset [24][25][26]. They cannot accurately estimate the daily sunshine duration when only cloud type data, and no surface incoming direct radiation data, are available.
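As an illustration of the available (astronomical) sunshine duration used in the percentage methods, the following sketch computes day length from latitude and day of year. It uses the widely used FAO-56 approximations for solar declination and sunset hour angle, which are an assumption here and not necessarily the sun trajectory equation of [10]; terrain masking is ignored and all function names are hypothetical.

```python
import math


def solar_declination(day_of_year: int) -> float:
    """Approximate solar declination in radians (FAO-56 style formula)."""
    return 0.409 * math.sin(2.0 * math.pi * day_of_year / 365.0 - 1.39)


def available_sunshine_hours(latitude_deg: float, day_of_year: int) -> float:
    """Maximum possible sunshine duration (day length) in hours, flat horizon."""
    lat = math.radians(latitude_deg)
    decl = solar_declination(day_of_year)
    # Clamping handles polar day (24 h) and polar night (0 h).
    cos_ws = max(-1.0, min(1.0, -math.tan(lat) * math.tan(decl)))
    sunset_hour_angle = math.acos(cos_ws)
    return 24.0 / math.pi * sunset_hour_angle


def sunshine_percentage(measured_hours: float, latitude_deg: float,
                        day_of_year: int) -> float:
    """Ratio of measured sunshine duration to available sunshine duration."""
    available = available_sunshine_hours(latitude_deg, day_of_year)
    return 0.0 if available == 0.0 else measured_hours / available


# Hypothetical example: a site at 39 degrees N around the June solstice.
print(round(available_sunshine_hours(39.0, 172), 2))
```

The clamp on the cosine of the sunset hour angle also reproduces the limiting cases at high latitudes, where the available duration is 0 h or 24 h.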
With the development of satellite remote sensing technology, it is possible to continuously observe a wide range of cloud cover at a regional scale. In particular, geostationary meteorological satellites can provide information about the types of cloud per hour or shorter time interval, and different cloud types will impact the solar radiation, which will ultimately affect the daily sunshine duration. Therefore, hourly or shorter time interval geostationary meteorological satellite cloud classification data can be used directly to calculate sunshine duration without the need for additional data.
The purpose of this paper is to investigate a method to derive the sunshine duration from geostationary meteorological satellite cloud classification data alone. A new index, the cloud type sunshine factor, is proposed to reflect the influence of the hourly cloud type on solar radiation. To accurately estimate the sunshine duration, the Shuffled Complex Evolution Algorithm (SCE-UA) was used to calibrate the sunshine factors of the different cloud types based on ground measurement data. The sunshine duration values estimated by the proposed algorithm were validated with independent ground observation data, and the spatial distribution was compared with the results of interpolation methods.

Study Site and Datasets
The Heihe River Basin (~128,900 km²), the second largest inland river basin in the country, is located in the arid northwestern region of China between 97°24′–102°10′ E and 37°41′–42°42′ N. Its elevation ranges from approximately 5000 m in the upper reaches to 1000 m downstream. The river originates in the Qilian Mountains and flows through the Hexi corridor of the province of Gansu from the Yingluo Gorge, through the Zhengyi Gorge, and then northward into the Ejina oasis in the western part of the Inner-Mongolia Plateau before finally discharging into the eastern and western Juyan Lakes (Figure 1). The landscape varies from glaciers and frozen soil to alpine meadow, forest, irrigated cropland, riparian ecosystem, bare gobi, and desert. The highest air temperatures are approximately 40 °C in downstream areas in the summer, and the lowest fall to approximately −40 °C in the upper watershed in the winter. The mean annual rainfall across the basin is 110.9 mm·yr⁻¹; the annual precipitation is more than 350 mm·yr⁻¹ in the upstream area, 100–250 mm·yr⁻¹ in the middle reaches, and less than 50 mm·yr⁻¹ in the downstream area.
The high heterogeneity of the underlying surface and the strong seasonal weather changes in the Heihe River Basin can better test the feasibility of the new method proposed in this paper.

Sunshine Duration Observation Data
Data from 14 meteorological stations covering the Heihe River Basin and its surrounding areas were used (Table 1). The stations are distributed across different land use types in both the mountain and plain areas. Each station is equipped with an observation system that records six meteorological variables: sunshine duration, air temperature, air pressure, air humidity, wind speed, and rainfall. The meteorological data were provided by the Chinese National Meteorological Bureau; the sunshine duration data used in the study cover 2007 to 2008 and can be downloaded from http://cdc.cma.gov.cn/cdc_en/home.dd. Quality control of the data was performed by the suppliers.

Center was used to obtain the cloud classification data in a geographic projection for the target area and the cloud classification coverage type (cloud type) [27], ranging from 1 to 7 and in the order of "Number" in Table 2. These data were generated by the FengYun-2D (FY-2D) satellite, the second operational vehicle of the first-generation geostationary

The influence of different cloud types on solar radiation is mainly reflected in the fact that the diurnal variation of cloud types leads to a variation of hourly illumination intensity, which then affects the daily sunshine duration. Therefore, we propose a new index, the cloud type sunshine factor (SF), to characterize how strongly each cloud type in the Fengyun (FY) geostationary meteorological satellite classification affects the sunshine duration per hour over the Heihe River Basin. This new sunshine factor was combined with the hourly cloud classification data between sunrise and sunset to estimate the sunshine duration (Equation (1)).
The sunrise time and sunset time over the

$$FY_{sunt} = \sum_{i = h_{sr} + 0.25}^{h_{ss} - 0.25} SF_i \times T_{gap} \qquad (1)$$

where $FY_{sunt}$ is the sunshine duration, obtained as the accumulation of sunshine factors between 15 min (+0.25 h) after the start of sunrise and 15 min (−0.25 h) before sunset; $SF_i$ is the FY-2D cloud type sunshine factor, the index assigned to the hourly FY-2D cloud type data from sunrise to sunset; $T_{gap}$ is the time interval of one hour, with a value of 1; $h_{sr}$ and $h_{ss}$ are the times of sunrise and sunset, respectively; and $i$ is a time series that ranges between sunrise and sunset at the local time.
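The accumulation described by Equation (1) can be sketched as follows. The sunshine factor values are those reported for the Heihe River Basin in Table 4 of this paper; the function name, the convention that cloud types are observed at whole local hours, and the sample day are assumptions made for illustration.

```python
from typing import Mapping

# Calibrated sunshine factors reported in Table 4 of the paper.
SUNSHINE_FACTOR = {
    "CLS": 0.9, "MIP": 0.21, "ALN": 0.25, "CIS": 0.51,
    "CIRS": 0.24, "CUC": 0.13, "STA": 0.35,
}


def estimate_sunshine_duration(hourly_cloud_type: Mapping[int, str],
                               sunrise_h: float, sunset_h: float,
                               t_gap: float = 1.0) -> float:
    """Accumulate sunshine factors between sunrise + 0.25 h and sunset - 0.25 h.

    `hourly_cloud_type` maps a local hour (assumed to be a whole hour) to the
    cloud type abbreviation observed at that hour.
    """
    start = sunrise_h + 0.25
    end = sunset_h - 0.25
    return sum(SUNSHINE_FACTOR[cloud] * t_gap
               for hour, cloud in hourly_cloud_type.items()
               if start <= hour <= end)


# Hypothetical day with three cloud regimes; sunrise at 06:30, sunset at 19:00.
day = {h: "CLS" for h in range(5, 13)}
day.update({h: "CIS" for h in range(13, 16)})
day.update({h: "CUC" for h in range(16, 21)})
print(round(estimate_sunshine_duration(day, sunrise_h=6.5, sunset_h=19.0), 2))
```

In this hypothetical day only the observations from 07:00 to 18:00 fall inside the window, so six hours with factor 0.9, three with 0.51, and three with 0.13 are accumulated.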

FY-2D Cloud Type Sunshine Factor
Based on Equation (1), the FY-2D cloud type sunshine factors (FY-2D-SF) were fitted at each meteorological station from the hourly FY-2D cloud type data between sunrise and sunset and the measured sunshine duration data of that station on different dates over the Heihe River Basin in 2007, under the condition that the correlation coefficient was greater than or equal to 0.9. Table 3 shows the resulting ranges of the FY-2D-SF, which indicate that the sunshine factor of a given cloud type fluctuates considerably at the regional scale. According to the data, the FY-2D-SF of CLS was the highest in 2007, from 0.78 to 0.98. The second highest was that of CIS, from 0.45 to 0.58, and the third was that of STA, from 0.30 to 0.41. The factors of MIP, ALN, and CIRS were similar to one another. The lowest FY-2D-SF was that of CUC.

FY-2D Cloud Type Sunshine Factor Estimation
In the previous section, the range of the sunshine factor for each cloud type across the different meteorological stations was established. Next, we used an optimization algorithm to calibrate the sunshine factors of the different cloud types at the regional scale. In this paper, the Shuffled Complex Evolution Algorithm (SCE-UA), which is based on the simplex algorithm, was used to optimize the FY-2D cloud type sunshine factors over the Heihe River Basin for 2007. SCE-UA is an effective method for solving nonlinear constrained optimization problems and can be used to find a global optimal solution. The algorithm was originally applied to a hydrological model, and we applied it to the optimization of FY-2D cloud type sunshine factors in this paper [28].
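The calibration step can be illustrated with a deliberately simplified stand-in. The sketch below is not SCE-UA: because the accumulation in Equation (1) is linear in the sunshine factors, it fits them with a plain projected gradient descent on the mean squared error, keeping each factor inside a station-derived range. The three cloud types, the daily hour counts, the range used for CUC, and the function names are all hypothetical.

```python
from typing import List, Sequence, Tuple


def predict(counts: Sequence[float], factors: Sequence[float]) -> float:
    """Daily sunshine duration as the sum of (hours of cloud type) x (factor)."""
    return sum(c * f for c, f in zip(counts, factors))


def calibrate_factors(hour_counts: Sequence[Sequence[float]],
                      measured: Sequence[float],
                      bounds: Sequence[Tuple[float, float]],
                      steps: int = 2000, rate: float = 0.005) -> List[float]:
    """Bounded least-squares fit by projected gradient descent (not SCE-UA)."""
    factors = [(lo + hi) / 2.0 for lo, hi in bounds]  # start at range midpoints
    n = len(measured)
    for _ in range(steps):
        grad = [0.0] * len(factors)
        for counts, target in zip(hour_counts, measured):
            err = predict(counts, factors) - target
            for k, c in enumerate(counts):
                grad[k] += 2.0 * err * c / n
        # Gradient step followed by projection back into the allowed ranges.
        factors = [min(hi, max(lo, f - rate * g))
                   for f, g, (lo, hi) in zip(factors, grad, bounds)]
    return factors


# Hypothetical data: daily hours of CLS, CIS and CUC on five days.
hour_counts = [[10, 2, 0], [4, 6, 2], [0, 3, 9], [6, 0, 6], [2, 8, 2]]
true_factors = [0.9, 0.51, 0.13]          # used only to synthesize "measurements"
measured = [predict(c, true_factors) for c in hour_counts]
# CLS and CIS ranges follow the text; the CUC range is an assumption.
bounds = [(0.78, 0.98), (0.45, 0.58), (0.05, 0.25)]
print([round(f, 3) for f in calibrate_factors(hour_counts, measured, bounds)])
```

A global method such as SCE-UA becomes more relevant when the objective is not convex or when measurement noise and additional constraints make a local search unreliable; the stand-in only shows how ranges, data, and an error measure fit together.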

Model Performance Assessment
The performance of the proposed method for estimating sunshine duration was assessed based on widely used goodness-of-fit statistics [29], including the coefficient of determination (R²), mean absolute error (MAE), root mean square error (RMSE), and index of agreement (d). These parameters are defined as follows:

$$R^2 = \frac{\left[\sum_{i=1}^{n}(O_i-\bar{O})(P_i-\bar{P})\right]^2}{\sum_{i=1}^{n}(O_i-\bar{O})^2\,\sum_{i=1}^{n}(P_i-\bar{P})^2}$$

$$MAE = \frac{1}{n}\sum_{i=1}^{n}\left|O_i-P_i\right|$$

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(O_i-P_i)^2}$$

$$d = 1-\frac{\sum_{i=1}^{n}(O_i-P_i)^2}{\sum_{i=1}^{n}\left(\left|P_i-\bar{O}\right|+\left|O_i-\bar{O}\right|\right)^2}$$

In the above equations, $O_i$ is the actual measurement, $P_i$ is its estimate, $\bar{O}$ is the mean measurement, $\bar{P}$ is the mean of the estimates, and $n$ is the sample size. Colaizzi and Liu suggest (1) that a model performs well when the MAE is less than 50% of the measured standard deviation, (2) that there are few outliers when the RMSE exceeds the MAE by no more than 50% of the MAE, and (3) that the higher the value of d, the better the model performance [30,31].

Sensors 2016, 16, 1859

Table 4 shows the final FY-2D cloud type sunshine factors over the Heihe River Basin after the application of the optimization algorithm. The sunshine factors of CLS, MIP, ALN, CIS, CIRS, CUC, and STA were 0.9, 0.21, 0.25, 0.51, 0.24, 0.13, and 0.35, respectively. Based on these sunshine factors, the daily sunshine duration values were calculated over the Heihe River Basin for 2008, and Table 5 and Figure 2 show the validation of the estimated daily sunshine duration against the ground-measured sunshine duration at each station in 2008. Across the Heihe River Basin, the coefficients of determination (R²) were greater than 0.89, except at three stations in the mountains (Tuole, Yeniugou, and Qilian). There was therefore a strong correlation between the actual sunshine duration and the sunshine duration estimated using the new method. The lower coefficients of determination at the mountain stations (Tuole, Yeniugou, and Qilian) are mainly due to the effects of topography; where topographic effects reduce the direct solar irradiance to below 120 W/m², no sunshine duration is registered.
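The goodness-of-fit statistics named in this section can be computed with a short sketch such as the following. It uses the standard definitions of MAE, RMSE, the squared correlation form of R², and Willmott's index of agreement; the function name and the sample values are assumptions made for illustration.

```python
import math
from typing import Dict, Sequence


def fit_statistics(observed: Sequence[float],
                   predicted: Sequence[float]) -> Dict[str, float]:
    """Return R^2, MAE, RMSE and index of agreement d for paired samples."""
    n = len(observed)
    o_mean = sum(observed) / n
    p_mean = sum(predicted) / n
    sq_err = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    mae = sum(abs(o - p) for o, p in zip(observed, predicted)) / n
    rmse = math.sqrt(sq_err / n)
    cov = sum((o - o_mean) * (p - p_mean) for o, p in zip(observed, predicted))
    var_o = sum((o - o_mean) ** 2 for o in observed)
    var_p = sum((p - p_mean) ** 2 for p in predicted)
    r2 = cov ** 2 / (var_o * var_p)
    potential = sum((abs(p - o_mean) + abs(o - o_mean)) ** 2
                    for o, p in zip(observed, predicted))
    d = 1.0 - sq_err / potential
    return {"r2": r2, "mae": mae, "rmse": rmse, "d": d}


# Hypothetical daily sunshine durations (hours): measured vs. estimated.
measured = [8.0, 10.0, 6.0, 12.0]
estimated = [7.0, 10.0, 7.0, 12.0]
stats = fit_statistics(measured, estimated)
# Outlier check in the sense used here: (RMSE - MAE) relative to MAE.
print(stats, (stats["rmse"] - stats["mae"]) / stats["mae"])
```

Because the RMSE is never smaller than the MAE, the relative gap between the two is a convenient indicator of how much a few large errors dominate the fit.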
The d values of all stations were greater than 0.990, again suggesting good performance. The difference between the RMSE and MAE was less than 40% of the MAE, meaning there are few outliers in the estimated sunshine duration values, according to Colaizzi.

Figure 3 shows the spatially distributed sunshine duration. The left map was obtained via IDW spatial interpolation (inverse distance weighting with exponent 2), and the middle map was obtained by the Kriging method (ordinary and exponential). In addition, different Kriging methods (ordinary or universal) and different semivariogram models (e.g., spherical, circular, Gaussian, and linear) were tested in the Kriging interpolation. The results from the Kriging interpolation were used as baseline data for comparison with the IDW results and the results of the proposed method. The comparison shows large differences in the spatial distribution pattern among the three methods.
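For reference, inverse distance weighting with exponent 2 can be sketched in a few lines. The planar station coordinates, the sunshine duration values, and the function name are hypothetical; operational GIS implementations typically add search radii and neighbor limits that are omitted here.

```python
import math
from typing import Sequence, Tuple

Station = Tuple[float, float, float]  # (x, y, sunshine duration in hours)


def idw(stations: Sequence[Station], x: float, y: float,
        power: float = 2.0) -> float:
    """Inverse distance weighted estimate at (x, y) from station values."""
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in stations:
        dist = math.hypot(x - sx, y - sy)
        if dist == 0.0:
            return value  # exactly at a station: return its measurement
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical stations on a planar grid (km) with daily sunshine hours.
stations = [(0.0, 0.0, 6.0), (10.0, 0.0, 10.0), (0.0, 10.0, 8.0)]
print(round(idw(stations, 2.0, 2.0), 2))
```

The estimate at any point is bounded by the station values, which is one reason sparse station networks produce smooth fields that cannot show cloud-driven spatial detail.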

Results
Because the ground measurements are sparsely distributed in the Heihe River Basin, especially in the downstream regions, the results of the interpolation methods, regardless of which one is used, differ in spatial distribution from the results of the method proposed in this paper. Moreover, Figure 2 shows that the proposed method agrees closely with the sunshine duration measured at the different stations in the Heihe River Basin; therefore, the proposed method is better able to show the spatial variation of the sunshine duration.

Discussion
In this paper, we demonstrate a method to derive the sunshine duration using only the cloud classification data from a geostationary meteorological satellite (FY-2D) without depending on continuous ground measurements. The method calibrates an FY-2D cloud type sunshine factor for the cloud classification data and combines it with the sunrise and sunset times to estimate the daily sunshine duration. High correlations were obtained between the estimated and measured in situ sunshine duration at different ground measurement stations, and the method accurately portrays the variation in the spatial distribution of the sunshine duration over the Heihe River Basin. This new method has the potential to support estimates of the diurnal variation of surface net radiation and evapotranspiration over large regions.
Remotely sensed data describe the pixel sunshine duration. For cloud classification data from FY-2D, the pixel value represents the average sunshine duration for a 5 km × 5 km area. In addition, some of the smaller clouds in the cloud classification data have not been accounted for, which can affect the sunshine duration estimated by the method proposed in this paper. Ground measurements of sunshine duration represent a much smaller area, which does not match with the area covered by a pixel. Consequently, some outliers can be expected in Table 5 and Figure 2, in which the estimated values are compared with the measured values. However, most meteorological stations in the plain areas are near cities, and the terrain around them is relatively flat and has a relatively homogeneous surface coverage. Thus, the ground measurement values in the plains areas can be assumed to represent the average value for the surrounding region and can be appropriately matched with remote sensing pixel values. In contrast, in the mountains, due to the effects of topography and the heterogeneity of the mountain surface, some areas have low direct solar irradiance, which leads to a lower correlation between the actual sunshine duration and the estimated sunshine duration, as shown in the results in the Tuole, Yeniugou, and Qilian stations.
The proposed method is an empirical method, and the sunshine factors of different cloud types in the Heihe River Basin were calculated from meteorological station data. The sunshine factors must be calibrated using the local ground measurements of sunshine duration if the proposed method is applied in other regions.
There is a ceiling value of the sunshine duration at a specific geographical latitude independent of the cloud type. For example, the sunshine duration is always zero in the polar night in the winter in the Arctic. The sunshine hours will also be affected by the topography. In addition, different cloud amounts have an impact on the sunshine factor even with the same cloud types. Further study will be carried out to take these factors into account, and the results may improve the quality of operational sunshine duration data.

Conclusions
A method was developed to estimate sunshine duration using cloud classification data from a geostationary meteorological satellite. The method was calibrated and validated in the Heihe River Basin of China. It uses hourly cloud type data from the Fengyun-2D geostationary meteorological cloud products and field measurements of sunshine duration to estimate a new index, the FY-2D cloud type sunshine factor. Using data from meteorological stations across the Heihe River Basin in 2007, the Shuffled Complex Evolution Algorithm (SCE-UA) was used to optimize the FY-2D sunshine factor over the basin. This new sunshine factor, combined with the hourly cloud classification data between sunrise and sunset, can be used to estimate sunshine duration. High correlations were obtained between the estimated and measured sunshine hours at different stations in 2008, with coefficients of determination (R²) greater than 0.89, except for three stations in the mountains. The spatial distribution analysis showed that the spatial distribution of the sunshine duration was better represented by the new method than by a method based on interpolation between meteorological stations.