Validation of CM SAF CLARA-A2 and SARAH-E Surface Solar Radiation Datasets over China

Abstract: To achieve high-quality surface solar radiation (SSR) data for climate monitoring and analysis, the two satellite-derived monthly SSR datasets of CM SAF CLARA-A2 and SARAH-E have been validated against a homogenized ground-based dataset covering 59 stations across China for 1993–2015 and 1999–2015, respectively. The satellite products overestimate surface solar irradiance by 10.0 W m⁻² in CLARA-A2 and 7.5 W m⁻² in SARAH-E on average. A strong urbanization effect has been noted behind the large positive bias in China. The bias decreased after 2004, possibly linked to a weakened attenuating effect of aerosols on radiation in China. Both satellite datasets can reproduce the monthly anomalies of SSR, as indicated by a significant correlation of around 0.8. Owing to the neglect of temporal aerosol variability in the satellite algorithms, the discrepancy between the satellite-estimated and ground-observed SSR trends slightly increases in 1999–2015 as compared to 1993–2015. The satellite products are more accurate during warm seasons than during cold seasons. With respect to the spatial performance, the effects of anthropogenic aerosols, dust aerosols, and high elevation and snow-covered surfaces should be better accounted for in the satellite SSR retrievals to further improve the performance in the eastern, northwestern and southwestern parts of China, respectively.


Introduction
High-quality surface solar radiation (SSR) data are in high demand to meet the growing needs of solar energy applications [1] and the accurate estimation of the radiation budget [2,3], hydrological processes [4,5] and the carbon cycle [6,7]. The basic data source is ground observations, which show a worldwide decrease in SSR since the 1950s (termed "global dimming") followed by an increase from the late 1980s ("global brightening") [8]. Uncertainties, however, still exist in the observed global dimming and brightening phenomena, due to the limitations of surface-based measurements [9]. Besides possible instrumental and operational issues, ground observations are sparse and unevenly distributed, especially over oceans, remote or sparsely populated areas and mountainous regions with complex terrain [10,11]. Correspondingly, an inhomogeneity issue in the ground-observed SSR trend has been noted in China, especially during the 1990–1993 period and in the high-elevation Tibet region [12]. For the whole of China, covering over 9.6 million km², there are only 130 solar radiation stations, and only 59 of them have records covering more than 75% of the measurement period from 1957 to the present. In addition, around three quarters of the stations are located in urban areas, indicating […] [29]. Besides the strong influence of anthropogenic aerosols, China's climate and topography are highly complex and vary significantly from region to region. Accordingly, large biases in previous satellite estimates of SSR in the GEWEX-SRB and ISCCP-FD products have been reported in China [28,29]. The largest differences between ground-observed and satellite-derived SSR have been found over the rapidly developing regions of South China [30,31], as well as over the highly variable terrain of the Tibetan Plateau [32,33]. However, two issues remain in the previous evaluations.
First, studies have mainly covered short time periods, which limits the validation of the temporal stability of the satellite products. Second, inhomogeneity issues in the surface observations have received little consideration. Therefore, this study attempts to validate the accuracy and the stability of two SSR datasets provided by the CM SAF, namely CLARA-A2, based on polar-orbiting satellites, and SARAH-E, based on geostationary satellites, against a homogenized dataset of surface observations over China. CLARA-A2 is the acronym of "CM SAF cLoud, Albedo and surface RAdiation dataset from AVHRR data - Edition 2," while SARAH-E is short for "Surface Solar Radiation Data Set - Heliosat, Meteosat-East (SARAH-E) - Edition 1." The validation of both datasets allows for a comparison of the performance of products derived from polar-orbiting and geostationary satellites. Detailed information on the datasets used in this study is presented in Section 2. Quality control of the ground observations and the validation metrics and procedure are explained in Section 3. The validation results with respect to data availability, accuracy and stability of the satellite products are shown and discussed in Section 4. Finally, the main conclusions are drawn in Section 5.

Datasets
Product user manuals and related reports regarding the two climate data records of CLARA-A2 and SARAH-E are available from the official website of the CM SAF at http://www.cmsaf.eu/. Here, only the information relevant to the validation is described. The ground-observed SSR dataset from the China Meteorological Administration (CMA) is used as the reference. For the comparisons, satellite data are extracted at the locations of the stations. Depending on the availability and homogeneity of the datasets, CLARA-A2 and SARAH-E SSR are validated against the surface reference measurements on the basis of monthly means for the periods 1993–2015 and 1999–2015, respectively. The spatial coverage and the multi-year mean SSR climatology of the analyzed datasets are shown in Figure 1. Details of the datasets are given below.
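The extraction of satellite data at the station locations can be illustrated with a minimal sketch. It assumes a regular latitude/longitude grid and a nearest-cell-centre match; the text does not state whether nearest-neighbour selection or interpolation was used, so the matching rule, the function name `nearest_cell_series` and the synthetic grid values below are illustrative assumptions only.

```python
import numpy as np

def nearest_cell_series(grid, lats, lons, station_lat, station_lon):
    """Return the time series of the grid cell whose centre is closest to a station.

    grid: array of shape (time, lat, lon); lats, lons: 1-D cell-centre coordinates.
    Nearest-cell matching is an assumption, not the documented CM SAF procedure.
    """
    i = int(np.argmin(np.abs(np.asarray(lats) - station_lat)))
    j = int(np.argmin(np.abs(np.asarray(lons) - station_lon)))
    return grid[:, i, j]

# Synthetic 0.25-degree grid over a small box (placeholder values, not real SSR)
lats = np.arange(30.125, 32.0, 0.25)
lons = np.arange(110.125, 112.0, 0.25)
grid = np.random.default_rng(0).uniform(50.0, 300.0, size=(12, lats.size, lons.size))

# Hypothetical station position used only for illustration
series = nearest_cell_series(grid, lats, lons, station_lat=30.62, station_lon=111.08)
```

On the finer 0.05° SARAH-E grid the same call applies unchanged; only the coordinate vectors differ.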

The CLARA-A2 Dataset
Derived from the Advanced Very High Resolution Radiometer (AVHRR) sensors onboard the polar-orbiting NOAA and METOP satellites, CLARA-A2 provides global information on cloud fraction and cloud properties, surface radiation and surface albedo from 1982 to 2015 with a spatial resolution of 0.25° × 0.25° [24,34]. The surface incoming shortwave radiation, that is, surface solar radiation (SSR), is provided in the CLARA-A2 climate data record as daily and monthly means (Table 1). Monthly SSR means of the CLARA-A2 dataset were validated in this study for the period 1993–2015. The measurement periods and orbits of all satellites covered by CLARA-A2 are illustrated in Figure 1 of Karlsson et al. [24]. A deficiency affecting the data quality of CLARA-A2 SSR is that only one satellite in orbit provided measurements during 1982–1991 [10]. This period has therefore been excluded from this study, to avoid the effect of a large amount of missing data on the validation results. The retrieval of CLARA-A2 SSR is based on a look-up-table approach that relates, in cloudy situations, the reflected fluxes at the top of the atmosphere to the surface fluxes. For clear-sky situations, the mesoscale atmospheric global irradiance code (MAGIC, http://gnu-magic.sourceforge.net/) algorithm is used to derive the clear-sky surface irradiance. In both cases, auxiliary data are used, namely a surface albedo map, monthly-averaged integrated water vapor and an aerosol climatology. Detailed information on the retrieval scheme can be found in Mueller et al. [35] and Karlsson et al. [36]. Data in areas where the surface albedo map used in the satellite retrieval differs by more than 35% from the actual surface albedo, as derived from the CLARA-A2 SAL data record, are set to missing because of the degraded data quality under these conditions (mainly snow-covered surfaces).

The SARAH-E Dataset
SARAH-E is derived from the visible channel of the Meteosat Visible Infra-Red Imager (MVIRI) instruments onboard the geostationary Meteosat IODC (Indian Ocean Data Coverage) satellites [37]. Compared to the CLARA-A2 product, SARAH-E covers a limited region of 8°W to 128°E longitude and ±65° latitude viewed by the geostationary Meteosat satellites and a shorter period of 1999–2016, but has a higher temporal resolution (up to hourly) and a higher spatial resolution of 0.05° × 0.05° (Table 1). The operation periods and positions of the relevant satellites, Meteosat-5 and Meteosat-7, are listed in Table 1 of Gracia Amillo et al. [38]. The SPECMAGIC eigenvector-hybrid Look-Up Table (LUT) approach is used for the generation of SARAH-E SSR [16]. It is noteworthy that, as for CLARA-A2, aerosol information is included as monthly climatological values in the SARAH-E SSR retrieval [17]. In other words, the effects of interannual and sub-monthly variations of aerosols are not considered in the retrieved CLARA-A2 and SARAH-E SSR. Higher uncertainties in the calculation of SSR over bright surfaces (e.g., desert regions), due to the reduced contrast between clouds and the surface, are also mentioned in the SARAH-E product user manual [37].

The CMA Dataset
The main source of ground observations of SSR over China is the China Meteorological Data Service Center (CMDC, http://data.cma.cn), governed by the CMA. The measurements started in 1957 but underwent a nationwide reorganization of both instruments and stations during 1990–1993 [39,40]. Consequently, an abnormal increase has been noted in the SSR trend of China over this period [12]. To avoid this inhomogeneity, only the monthly surface data for the period after 1993, with consistent measurements, are used as the reference in this study. During the studied period, Chinese-developed thermopile pyranometers were used for the surface solar irradiance measurements (in W m⁻²), with an uncertainty of 3.4% (~5 W m⁻²) [41]. Every other year, all operating instruments were calibrated against the national reference groups of China, which in turn were calibrated against references at the World Radiation Center (WRC) every 5 years [39]. A basic data quality control has been performed by the CMA, which, however, is not sufficient to remove all errors [42]. Therefore, an additional homogenization of the surface observations has been applied, with the specific procedure described in the next section.

Data Quality Control
The 59 CMA stations finally selected, with homogeneous SSR observations during 1993–2015, are shown in Figure 2. They cover 18°13′N–50°15′N latitude and 80°14′E–129°30′E longitude, with elevations varying from 3 to 4507 m. The homogenization process includes three steps. In the first step, 92 of the total 130 CMA stations, with long-term records covering more than 95% of the studied period 1993–2015, were selected. In the second step, to eliminate outliers, a physical threshold test was applied to the monthly CMA SSR following the two criteria proposed by Shi et al. [42] and Tang et al. [43]: (1) $0.03R_a < \mathrm{SSR} < R_a$ and (2) $\mathrm{SSR} < 1.1R_{so}$, where $R_a$ and $R_{so}$ are the extraterrestrial radiation and the clear-sky radiation, respectively, calculated with the FAO-56 method [44]. With this test, 41 extreme monthly values (~0.15% of the total) that exceeded the physical thresholds were deleted. In the third step, the Standard Normal Homogeneity Test (SNHT, [45]) was used to detect inhomogeneities in the monthly timeseries of CMA SSR, with sunshine duration (SD, the most widely used proxy for SSR, collected as monthly records from the CMDC) and CLARA-A2 SSR as the references for comparison. The SNHT derives a statistic $T(k)$ for each month $k \in (1, 2, \ldots, n)$, with $n$ being the total number of months (i.e., 276 in this study):

$$T(k) = k\,\bar{z}_1^{2} + (n-k)\,\bar{z}_2^{2},$$

where $\bar{z}_1$ and $\bar{z}_2$ are the averages of the standardized $X_i$ values before and after a possible break point, calculated based on the mean $\bar{X}$ and standard deviation $\sigma$ of the whole timeseries $\{X_i\}$:

$$\bar{z}_1 = \frac{1}{k}\sum_{i=1}^{k}\frac{X_i-\bar{X}}{\sigma}, \qquad \bar{z}_2 = \frac{1}{n-k}\sum_{i=k+1}^{n}\frac{X_i-\bar{X}}{\sigma}.$$

Possible breaks or shifts in the timeseries can then be detected in the months when $T(k)$ exceeds a certain critical value, that is, 9.966 for the 95% level used in this study, as taken from Table I of Khaliq and Ouarda [46].
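The physical threshold test of the second step can be sketched as follows. The sketch assumes the standard FAO-56 expressions for daily extraterrestrial radiation and for clear-sky radiation, $R_{so} = (0.75 + 2\times10^{-5} z)R_a$ with station elevation $z$ in metres, and it assumes that monthly values are obtained by averaging the daily values; the exact monthly aggregation used for the CMA data is not stated in the text, and all function names are hypothetical.

```python
import numpy as np

# FAO-56 solar constant: 0.0820 MJ m-2 min-1, converted to W m-2
GSC = 0.0820e6 / 60.0

def extraterrestrial_radiation(lat_deg, day_of_year):
    """Daily-mean extraterrestrial radiation R_a (W m-2) following FAO-56."""
    phi = np.deg2rad(lat_deg)
    dr = 1.0 + 0.033 * np.cos(2.0 * np.pi * day_of_year / 365.0)      # inverse relative Earth-Sun distance
    delta = 0.409 * np.sin(2.0 * np.pi * day_of_year / 365.0 - 1.39)  # solar declination (rad)
    ws = np.arccos(np.clip(-np.tan(phi) * np.tan(delta), -1.0, 1.0))  # sunset hour angle (rad)
    return (GSC / np.pi) * dr * (ws * np.sin(phi) * np.sin(delta)
                                 + np.cos(phi) * np.cos(delta) * np.sin(ws))

def clear_sky_radiation(ra, elevation_m):
    """Clear-sky radiation R_so (W m-2) from the FAO-56 elevation formula."""
    return (0.75 + 2.0e-5 * elevation_m) * ra

def passes_threshold(ssr, ra, rso):
    """True where both criteria hold: 0.03 R_a < SSR < R_a and SSR < 1.1 R_so."""
    return (0.03 * ra < ssr) & (ssr < ra) & (ssr < 1.1 * rso)

# Illustrative January check for a hypothetical station at 40 N and 50 m elevation
ra_jan = extraterrestrial_radiation(40.0, np.arange(1, 32)).mean()
rso_jan = clear_sky_radiation(ra_jan, 50.0)
flags = [passes_threshold(v, ra_jan, rso_jan) for v in (100.0, 170.0, 2.0)]
```

In this illustration a monthly mean of 100 W m⁻² passes, while 170 W m⁻² exceeds the clear-sky limit and 2 W m⁻² falls below the lower bound.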
The SNHT was applied to the timeseries of CMA SSR, SD and CLARA-A2 SSR separately (known as "absolute" SNHT), as well as to the relative bias series of CMA SSR minus SD or CLARA-A2 SSR (known as "relative" SNHT). Monthly anomalies from the 1993–2015 monthly means, instead of absolute values, were used to reduce possible effects of the annual solar cycle. SARAH-E was not included as a reference series because its observational period of 1999–2015 is shorter than the homogenized period of the surface data. After this, a further 33 CMA stations were excluded from this study, namely those where the SNHT detected significant breaks in the relative bias series that could not be attributed to inhomogeneity in the reference series or to changes in non-climatic factors (e.g., aerosols).
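A minimal sketch of the SNHT statistic is given here for illustration. It assumes a series without missing values, evaluates $T(k)$ only for $k = 1, \ldots, n-1$ (the mean after the break is undefined at $k = n$), and standardizes with the sample standard deviation (`ddof=1`), which is an assumption; the function name and the synthetic series with an artificial step are hypothetical and are not CMA data.

```python
import numpy as np

CRITICAL_95 = 9.966  # critical value at the 95% level quoted in the text

def snht_statistic(x):
    """Return T(k) for k = 1..n-1 for a 1-D series without missing values."""
    x = np.asarray(x, dtype=float)
    n = x.size
    z = (x - x.mean()) / x.std(ddof=1)   # standardize with whole-series mean and std
    k = np.arange(1, n)
    csum = np.cumsum(z)[:-1]             # sum of z_1 .. z_k
    z1 = csum / k                        # mean before the candidate break
    z2 = (z.sum() - csum) / (n - k)      # mean after the candidate break
    return k * z1**2 + (n - k) * z2**2

# Synthetic 276-month anomaly series with an artificial shift after month 150
n = 276
x = 0.1 * np.sin(np.arange(n))
x[150:] += 1.0

t_values = snht_statistic(x)
k_break = int(np.argmax(t_values)) + 1   # convert array index to month index k
has_break = t_values.max() > CRITICAL_95
```

For the "relative" variant described above, the same function would be applied to the difference between the candidate series and a reference series rather than to the candidate series itself.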

Validation Metrics and Procedure
To quantify the performance of the CM SAF CLARA-A2 and SARAH-E SSR products, two sets of metrics were chosen for the accuracy and stability tests, respectively.
For the accuracy test, the basic metrics of mean bias deviation (MBD), mean absolute bias deviation (MABD) and root mean square deviation (RMSD) were used, to remain consistent with previous assessments of CM SAF products over other regions of the world [17,25,27]. With $x$ denoting the satellite-derived SSR record to be validated, $y$ the surface reference measurement and $n$ the number of months, the metrics are defined as follows:

Mean Bias Deviation (MBD): The mean difference between the compared datasets, indicating an average over- or underestimation of the satellite-derived dataset compared to the reference dataset.

$$\mathrm{MBD} = \frac{1}{n}\sum_{i=1}^{n}\left(x_i - y_i\right)$$

Mean Absolute Bias Deviation (MABD): The mean of the absolute differences between the compared datasets. An MABD of 15 W m⁻² is the threshold for accuracy applied in this study, combining 10 W m⁻² for the target accuracy defined in the CM SAF CDOP Product Requirements Document [15] and 5 W m⁻² for the uncertainty in pyranometer measurements over China [41].

$$\mathrm{MABD} = \frac{1}{n}\sum_{i=1}^{n}\left|x_i - y_i\right|$$

Root Mean Square Deviation (RMSD): The square root of the mean squared difference between satellite-derived and ground-observed values. Compared to MABD, RMSD is more sensitive to outliers [47].

$$\mathrm{RMSD} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - y_i\right)^{2}}$$
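The three accuracy metrics can be computed as in the following sketch, which assumes paired monthly values with missing data marked as NaN and uses only months available in both series. The function name and the sample numbers are illustrative, not values from the study.

```python
import numpy as np

def validation_metrics(sat, obs):
    """Return MBD, MABD and RMSD (same units as the input) for paired monthly values."""
    sat = np.asarray(sat, dtype=float)
    obs = np.asarray(obs, dtype=float)
    valid = ~(np.isnan(sat) | np.isnan(obs))   # keep months present in both series
    d = sat[valid] - obs[valid]                # satellite minus surface reference
    return {
        "MBD": d.mean(),
        "MABD": np.abs(d).mean(),
        "RMSD": np.sqrt((d ** 2).mean()),
    }

# Made-up monthly means in W m-2; the last satellite value is missing
sat = [110.0, 95.0, 130.0, np.nan]
obs = [100.0, 100.0, 120.0, 90.0]
metrics = validation_metrics(sat, obs)
```

Dividing each metric by the mean of the valid reference values gives relative deviations in the same way.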
The accuracy of the satellite products was tested at both temporal (annual and seasonal) and spatial scales. The seasons are defined as spring (MAM, March to May), summer (JJA, June to August), autumn (SON, September to November) and winter (DJF, December to February).
For the stability test, in the first step, the annual variations of the three metrics MBD, MABD and RMSD were analyzed for the CLARA-A2 and SARAH-E SSR products for the periods 1993–2015 and 1999–2015, respectively. In addition, monthly SSR anomalies relative to the 1993–2015 and 1999–2015 means were derived for CLARA-A2 and SARAH-E, respectively, and for the corresponding CMA surface reference data, to exclude the effect of the annual cycle of SSR. The anomaly correlations between the CM SAF and CMA radiation datasets were then calculated. Finally, the linear trends in the monthly SSR anomalies of the CM SAF and CMA datasets were compared to evaluate their agreement.
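The stability metrics can be sketched as follows, assuming that anomalies are taken relative to the multi-year mean of each calendar month, that the correlation is the Pearson coefficient, and that trends are obtained by ordinary least squares; the text does not specify the trend estimator, so these choices and the function names are assumptions. The synthetic series are for illustration only.

```python
import numpy as np

def monthly_anomalies(series):
    """Subtract the multi-year mean of each calendar month.

    series: 1-D array of monthly means starting in January, length a multiple of 12.
    """
    x = np.asarray(series, dtype=float).reshape(-1, 12)
    climatology = np.nanmean(x, axis=0)
    return (x - climatology).ravel()

def anomaly_correlation(a, b):
    """Pearson correlation of two anomaly series over their common valid months."""
    valid = ~(np.isnan(a) | np.isnan(b))
    return np.corrcoef(a[valid], b[valid])[0, 1]

def linear_trend_per_decade(anom):
    """Least-squares slope of an anomaly series, in units per decade."""
    t = np.arange(anom.size) / 120.0          # time in decades
    valid = ~np.isnan(anom)
    slope, _intercept = np.polyfit(t[valid], anom[valid], 1)
    return slope

# Synthetic 23-year example: seasonal cycle plus a trend of 2.4 W m-2 per decade
months = np.arange(276)
obs = 150.0 + 80.0 * np.sin(2.0 * np.pi * (months % 12) / 12.0) + 0.02 * months
sat = obs + 10.0                              # constant positive bias

obs_anom = monthly_anomalies(obs)
sat_anom = monthly_anomalies(sat)
r = anomaly_correlation(sat_anom, obs_anom)
trend_obs = linear_trend_per_decade(obs_anom)
trend_sat = linear_trend_per_decade(sat_anom)
```

A constant bias leaves both the anomaly correlation and the trend unchanged, which is why accuracy and stability are assessed with separate sets of metrics.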

Availability Test
To begin with, the data availability of the CLARA-A2 and SARAH-E products over China has been checked. An indication can then be given for the performance comparison between polar-orbiting and geostationary satellites.
As shown in Figure 3, the polar-orbiting satellite-based CLARA-A2 has a complete spatial coverage over China. The availability of monthly CLARA-A2 records generally increases with increasing latitude. This is due to the skewed distribution of observations by the polar sun-synchronous satellites, with a higher frequency near the poles than near the equator [24]. Data availability fluctuates significantly during 1993–2001, when only two simultaneous satellites provided measurements, in the morning and afternoon orbits at about 7:00 and 15:00 local time, respectively [10]. No data are available from November 1994 to January 1995 and from October to December 2000, when only one AVHRR instrument was providing data. After a third, morning satellite started observations at about 10:00 local time in 2002 [24], the availability of CLARA-A2 records became quite stable and almost 100% complete.

In comparison, the geostationary satellite-based SARAH-E dataset has full availability at all stations within the spatial coverage of the Meteosat disk, except for the northeastern region of China. The spatial coverage of the SARAH-E product over China further shrinks to the west after 2007, when the Meteosat East observing location over the Indian Ocean shifted from 63°E to 57°E [38]. This is also seen in the time evolution of the availability ratio of the SARAH-E product (Figure 3, lower right). There are three stages: no availability during 1993–1998, 83% availability during 1999–2006 and a decrease to 69% availability during 2007–2015 over the 59 stations across China.
In brief, the polar-orbiting CLARA-A2 SSR product has a better spatial coverage and a longer observational period over China, whereas the geostationary SARAH-E SSR data are more consistent and have higher spatial and temporal resolutions. In the following analysis, only records available in both the CM SAF and CMA datasets are considered, in order to ensure the same number of months for the comparisons.
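The availability ratio and the restriction to common records can be sketched as follows, assuming arrays of shape (months, stations) with missing records marked as NaN. The function names and the small sample arrays are illustrative assumptions.

```python
import numpy as np

def availability_ratio(data):
    """Fraction of stations with a valid value in each month; data is (months, stations)."""
    return np.mean(~np.isnan(data), axis=1)

def keep_common(sat, obs):
    """Mask both arrays so that only records present in both datasets remain."""
    both = ~(np.isnan(sat) | np.isnan(obs))
    return np.where(both, sat, np.nan), np.where(both, obs, np.nan)

# Two months at three stations, with gaps in both datasets (made-up values)
sat = np.array([[1.0, np.nan, 3.0],
                [np.nan, np.nan, 6.0]])
obs = np.array([[1.0, 2.0, np.nan],
                [4.0, 5.0, 6.0]])

ratio = availability_ratio(sat)
sat_common, obs_common = keep_common(sat, obs)
```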

Accuracy Test
Both CLARA-A2 and SARAH-E have previously been validated against globally distributed observation sites from the Baseline Surface Radiation Network (BSRN) Archive (http://bsrn.awi.de/), which includes one station in China, Xianghe [48]. With reference to global BSRN data on a monthly mean basis, CLARA-A2 shows a high accuracy, with an MBD of −1.6 W m⁻², an MABD of 8.8 W m⁻² and a standard deviation of 13.1 W m⁻² [24]; comparably, SARAH-E has an MBD of −1.7 W m⁻², an MABD of 7.9 W m⁻² and a standard deviation of 10.4 W m⁻² [37] when compared to the BSRN stations covered by the SARAH-E data record. Table 2 summarizes the averaged metrics over China in comparison to the homogenized CMA surface observations for the common periods. Although a negative bias has been observed on the global scale compared to the BSRN sites, as well as in Europe compared to the GEBA dataset and a variety of surface stations from European weather services [10,17], CLARA-A2 overestimates SSR in China, with a positive MBD of 10.0 W m⁻². Equally, the MABD fails to fulfill the accuracy threshold of 15 W m⁻². Nevertheless, the performance of CLARA-A2 is comparable to that of other products over China, as large positive MBDs have also been reported for GEWEX-SRB (8.5–14.6 W m⁻²), ISCCP-FD (16.4–18.3 W m⁻²), CERES-EBAF (8.1 W m⁻²) and UMD-SRB (14.2 W m⁻²) compared to the CMA observations [28,29,39,49]. The systematic overestimation by satellite algorithms in China was suggested to be most likely associated with aerosols and their complex interactions with clouds [30]. Compared to CLARA-A2, SARAH-E shows a slightly higher accuracy over China, with an MBD of 7.5 W m⁻², an MABD of 15.1 W m⁻² and an RMSD of 18.7 W m⁻² (Table 2), which might be related to its higher spatial and temporal resolutions. The different retrieval schemes used for CLARA-A2 (based on AVHRR sensors) and SARAH-E (based on MVIRI instruments) might also have an impact.
To test possible random effects of sampling, we recalculated the metrics for CLARA-A2 after limiting its spatial and temporal coverage to that of SARAH-E. The results show an even higher MBD of 12.5 W m⁻², MABD of 18.2 W m⁻² and RMSD of 21.4 W m⁻² for CLARA-A2. Therefore, the higher accuracy of the SARAH-E SSR estimates compared to CLARA-A2 in China is not a chance result of the different spatial and temporal coverages of the two satellite products. SARAH-E estimates over China also perform better than over India, where a bias almost twice as large has been reported [27].

The annual cycle of SSR is well tracked by both CLARA-A2 and SARAH-E, reaching the maximum in summer, with means over 200 W m⁻², and the minimum in winter, with means under 120 W m⁻² (Figure 4). However, a generally opposite annual cycle can be observed in the relative MBD, MABD and RMSD of both datasets, with the largest relative deviations occurring in winter. A similar pattern has been noted in Europe and attributed to inaccurate SSR retrievals over snow-covered surfaces, a degraded retrieval quality under low sun and large viewing angle conditions in the northern hemisphere during wintertime, as well as low absolute levels of SSR [25]. In addition, the large relative bias during winter in China might be partially introduced by the neglect of sub-monthly aerosol variability in the satellite retrievals, since winter is the season with the highest aerosol concentrations in China due to domestic heating and stagnant weather [50]. The low absolute MABDs during wintertime simply reflect the low insolation [14]. Similarly, the relatively high insolation during spring could partially explain why the absolute values of the metrics indicate the largest discrepancy in that season (Table 2), while in relative terms spring still shows the second largest discrepancy. Spring is characterized by frequent dust storm events in Northwest and North China [51] and extremely high humidity events with low visibility in Southern China [52]. These factors increase the difficulty of achieving accurate satellite retrievals of SSR in spring in China. The best performance of the satellite estimates is in summer, when rainfall scavenging of air pollutants is most efficient [53]. The bias slightly increases in autumn, with increasing biomass burning during the harvest season [50]. The seasonal performance of the satellite products is generally in good agreement with the surface SSR observations; the remaining differences might be related to the seasonal cycle of aerosol loadings in China.
The spatial variability of SSR in China can be accurately described by both CLARA-A2 and SARAH-E SSR products (Figure 1). At the regional scale, there is a general overestimation of SSR in the satellite estimates over the eastern part of China, with positive biases in the range of 2.4 to 33.0 W m⁻² for CLARA-A2 and 1.1 to 26.1 W m⁻² for SARAH-E (Figure 5). In the western part, negative biases are prevalent, ranging from −17.2 to −0.1 W m⁻² for CLARA-A2 and from −14.6 to −2.2 W m⁻² for SARAH-E. A significantly positive correlation between MBD and longitude is shown in Figure 6b. Latitude has a significantly negative correlation only with the MBD of CLARA-A2, possibly reflecting the influence of data availability (Figure 3). Similar spatial patterns have been noted in the biases of the GEWEX-SRB [28,29], the ISCCP-FD [49] and the UMD-SRB dataset [30] compared to the CMA surface measurements.

Another potential reason underlying the larger biases in China than in the global average is the fact that most of the solar radiation stations in China are located in urban areas. Urban areas are the main source of anthropogenic aerosols and thus may have higher aerosol loadings than rural areas. A significant urbanization effect has been noted in the evolution of solar radiation and the related variables of sunshine duration and diurnal temperature range in China [13,55,56]. Using the same classification method of urban and rural stations as in Wang et al. [13], we found that over 86% of the selected stations are located in urban areas, indicating a strong urbanization effect on the validation results. Averaging only over the rural stations, the discrepancy largely reduces in both […]

In the eastern part of China, the overestimation of SSR in the CM SAF products is probably due to an underestimation of aerosol optical depth (AOD) in this region of rapid economic growth. Similar conclusions have been drawn by Xia et al. [30] and Wu and Fu [28], who identified the improper representation of aerosols in the satellite algorithms as the main explanation for the overestimation of SSR in various satellite products over eastern China.
In the northwestern part of China, where dust storms are the dominant natural source of aerosol, the high and temporally and spatially variable aerosol loading induced by frequent dust storm events is not captured by the aerosol climatology used in the retrieval. Similarly, Hayasaka et al. [29] attributed the negative biases noted in the GEWEX-SRB product over the desert areas of western China, that is, northwestern China, to the difficulty of evaluating aerosol optical depth (AOD) over variable, high-albedo surfaces. Wu and Fu [28] attributed the underestimation of SSR in northwestern China to an overestimation of cloud amount, although their Figure 3 shows that the underestimation mainly occurs in the seasons of active dust storms.
In southwestern China, especially the Tibetan Plateau, the observed negative biases might be related to the neglect of elevation effects in the satellite algorithms and to degraded data quality over snow-covered surfaces. Yang et al. [32,33] highlighted the difficulty of accurate satellite retrievals in the Himalayan region, given its highly variable terrain and elevation-dependent atmospheric environment. In the CM SAF SSR retrievals, the reduced atmospheric scattering at higher elevations is not considered. The reduced absorption, mainly by water vapor, is accounted for through the integrated water vapor column from the ERA-Interim dataset, which includes the effect of elevation, albeit at a coarser spatial resolution. Negative biases of the CM SAF SSR have also been reported in the Alpine region [17,54]. As seen in Figure 6c, MBD shows its highest correlation, r > 0.7, with elevation (p < 0.05). This might also explain why several positive biases still appear in the low-elevation areas of northwestern China (Figure 5).
In addition, an exceptionally large negative bias of −32.4 W m−2 for CLARA-A2 and −48.9 W m−2 for SARAH-E was observed at the Emeishan station (no. 32 in Figure 2), located at the top of Mt. Emei at an elevation of 3047 m. Besides seasonal snow, a common case for high-latitude areas, Mt. Emei is famous for the so-called Cloud Sea from November to February, which consists mostly of low clouds below 2000 m and thus represents heterogeneous cloud patterns [28]. Considering the non-representativeness of this station location, we suggest excluding this station from future work that aims to evaluate satellite-derived estimates. After excluding the Emeishan station, the average validation results are 10.7 ± 1.5 W m−2 MBD, 16.5 ± 0.8 W m−2 MABD and 20.0 ± 0.8 W m−2 RMSD for CLARA-A2, and 8.7 ± 1.5 W m−2 MBD, 14.4 ± 0.6 W m−2 MABD and 17.9 ± 0.7 W m−2 RMSD for SARAH-E. The higher accuracy of SARAH-E as compared to CLARA-A2 is also indicated in the comparisons at the individual stations (Figure 5). The share of stations whose MABD meets the accuracy threshold of 15 W m−2 is 42% for CLARA-A2 and 63% for SARAH-E.
Another potential reason for the larger biases in China relative to the global average is that most of the solar radiation stations in China are located in urban areas. Urban areas are the main source of anthropogenic aerosols and thus may have higher aerosol loadings than rural areas. A significant urbanization effect has been noted in the evolution of solar radiation and of the related variables of sunshine duration and diurnal temperature range in China [13,55,56]. Using the same classification of urban and rural stations as proposed by Wang et al. [13], we found that over 86% of the selected stations are located in urban areas, indicating a strong urbanization effect on the validation results. When averaging only over the rural stations, the discrepancy is largely reduced for both CLARA-A2 and SARAH-E (Table 3). To test for possible random effects from the spatial distribution of the rural stations, we also calculated the metrics for their nearby urban stations displayed in Figure 2. The results still show a smaller discrepancy at the rural stations than at the corresponding nearby urban stations (Table 3). Unfortunately, the limited number of only eight rural stations, located mainly in the northern part of China, hindered a national-scale exploration of the urbanization effect (Figure 2). Furthermore, the availability of rural stations decreases further for the validation of the SARAH-E product, due to the limited spatial and temporal coverage of SARAH-E over China (Figure 3). Despite these deficiencies, an enhanced agreement between the CM SAF satellite-derived radiation and surface observations can be expected if more rural stations become available in China for validation. In addition, other operational effects that were not detected or excluded in this study might also exist and contribute to the discrepancies with the satellite estimates.
Table 3. Urbanization effects on the CLARA-A2 and SARAH-E validation results.
MBD: mean bias deviation; MABD: mean absolute bias deviation; RMSD: root mean square deviation.
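As a minimal sketch of how the three metrics named above are conventionally computed from collocated monthly means, the following Python snippet may be helpful. The function name `validation_metrics`, the sample values and the sign convention (satellite minus surface, so that a positive MBD means satellite overestimation) are assumptions made for illustration; they are not taken from the CM SAF processing chain.

```python
import numpy as np

def validation_metrics(satellite, surface):
    """Return (MBD, MABD, RMSD) in the units of the inputs (e.g. W m-2)."""
    sat = np.asarray(satellite, dtype=float)
    obs = np.asarray(surface, dtype=float)
    # Use only the months in which both records are available.
    valid = ~(np.isnan(sat) | np.isnan(obs))
    diff = sat[valid] - obs[valid]          # assumed convention: satellite minus surface
    mbd = diff.mean()                       # mean bias deviation
    mabd = np.abs(diff).mean()              # mean absolute bias deviation
    rmsd = np.sqrt((diff ** 2).mean())      # root mean square deviation
    return mbd, mabd, rmsd

# Hypothetical monthly means (W m-2) at one station; not real data.
sat = [150.0, 160.0, 170.0, np.nan]
obs = [140.0, 165.0, 160.0, 155.0]
mbd, mabd, rmsd = validation_metrics(sat, obs)

# Share of stations meeting a 15 W m-2 MABD threshold (hypothetical station values).
station_mabd = np.array([12.0, 18.5, 14.9, 21.0])
share_meeting_threshold = (station_mabd <= 15.0).mean()
```

Because positive and negative monthly differences cancel in MBD but not in MABD or RMSD, a station can show a small MBD while still failing an MABD-based accuracy threshold.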

Comparison of Trends from Satellite and Surface
Because the CM SAF radiation datasets are designed for the study of climate trends, temporal stability is one of their critical properties [17]. [24]. However, the overestimation of SSR decreases not only for CLARA-A2 but also for SARAH-E during the last decade, suggesting a common phenomenon in the satellite-based records. Meanwhile, a transition to decreasing PM2.5 concentrations has been reported in China after 2005 [57]. Considering that a temporally constant aerosol climatology is applied in the satellite retrievals, the decreasing (increasing) bias in the post-2004 (pre-2004) period is very likely due to the neglect of the decrease (increase) in aerosol and its attenuating effect on SSR in China. Similarly, a change point in the time evolution of the discrepancy between satellite estimates and surface measurements, from a stable to a sharply increasing trend, has been observed in 2009 for India, with changes in aerosol loading or properties over India that are unresolved by the SARAH-E aerosol climatology proposed as a possible cause [27]. On an annual basis, the geostationary satellite-based SARAH-E still shows a higher accuracy than the polar-orbiting satellite-based CLARA-A2 product. The annual MABDs of CLARA-A2 are generally higher than the accuracy threshold, with only one year, 2012, meeting the requirement (Figure 7b). In comparison, the SARAH-E MABD meets the accuracy threshold in 42% of the period 1999–2015, with a generally stable trend. On the other hand, the RMSD for SARAH-E shows a slightly increasing trend (Figure 7c), possibly due to the reduced spatial samples for China in the SARAH-E product (Figure 3).
A significant correlation between satellite estimates and surface measurements at the monthly scale is indicated in Figure 8. Here, anomalies rather than absolute values were used to exclude the effect of the annual cycle of SSR, which gives more representative correlation coefficients and a better measure of the quality of the satellite product [49]. The correlation coefficient r is almost 0.8 for both CLARA-A2 and SARAH-E, suggesting that both data records can reasonably reproduce the monthly anomalies of SSR in China. In comparison, lower deseasonalized correlation coefficients, ranging from 0.48 to 0.72, were found for the other satellite radiation products GEWEX-SRB, ISCCP-FD, CERES-EBAF and UMD-SRB as compared to the CMA observations over China [49].
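The deseasonalization step described above can be sketched as follows. This is an illustrative assumption of one common approach (subtracting each calendar month's long-term mean before correlating); the function names `monthly_anomalies` and `anomaly_correlation` and the synthetic series are hypothetical and do not reproduce the exact procedure or data of this study.

```python
import numpy as np

def monthly_anomalies(values, months):
    """Subtract each calendar month's long-term mean from the series."""
    values = np.asarray(values, dtype=float)
    months = np.asarray(months)
    anomalies = np.full(values.shape, np.nan)
    for m in np.unique(months):
        sel = months == m
        anomalies[sel] = values[sel] - np.nanmean(values[sel])
    return anomalies

def anomaly_correlation(satellite, surface, months):
    """Pearson correlation of the deseasonalized monthly series."""
    a = monthly_anomalies(satellite, months)
    b = monthly_anomalies(surface, months)
    valid = ~(np.isnan(a) | np.isnan(b))
    return np.corrcoef(a[valid], b[valid])[0, 1]

# Synthetic three-year example (W m-2); not real data.
months = np.tile(np.arange(1, 13), 3)
seasonal = 150.0 + 80.0 * np.sin(2.0 * np.pi * (months - 1) / 12.0)
perturbation = np.linspace(-5.0, 5.0, months.size)
surface = seasonal + perturbation
# The satellite series carries a constant bias and a scaled copy of the same
# perturbation, so the anomaly correlation equals 1 by construction.
satellite = seasonal + 10.0 + 0.5 * perturbation
r = anomaly_correlation(satellite, surface, months)
```

Correlating the raw series instead would yield a high coefficient for almost any product, because the shared annual cycle dominates the variance; removing it isolates the month-to-month variability that the products are actually being tested on.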
Figure 9 compares the trends derived from the composite anomaly time series of the satellite and surface records at the station locations for the common periods. None of the trends passes the significance test at the 95% confidence level. Surface-measured SSR in China has tended to level off since the 1990s, consistent with previous analyses [12,40,58]. CLARA-A2 generally records the levelling-off trend in the surface data over 59 stations across China for 1993–2015 with a slight increase of 0.84 W m−2 decade−1, showing a difference of less than 1 W m−2 decade−1 (Figure 9a). For the observing period of SARAH-E since 1999, surface-based SSR shows a more obvious increasing trend of 1.45 W m−2 decade−1, while SARAH-E SSR decreases slightly, with a trend of −0.71 W m−2 decade−1 (Figure 9b). When the period is limited to 1999–2015, as used for SARAH-E, the CLARA-A2 and CMA SSR trends likewise have opposite signs (Figure 9a). A varying aerosol trend in China might have contributed to the larger discrepancy between satellite-derived and surface-observed SSR trends in the latter period, which is also indicated in Figure 7.
The Breathing Earth System Simulator (BESS) product reported no trend (p > 0.1) of SSR over China between 2001 and 2016 [59], consistent with the SSR trend derived from sunshine duration [39]. The CM SAF radiation datasets have been shown to detect the brightening trend over Europe, which in the CM SAF framework is primarily related to changes in clouds [9,10]. In China, where aerosol is the dominant factor for the decadal trends in SSR [60], including aerosol variability in the satellite algorithms might be necessary for an accurate detection of changes in SSR.
Figure 9. Comparisons of the trends in the monthly SSR anomalies between the collocated CLARA-A2 ((a), red lines)/SARAH-E ((b), blue lines) and CMA (black lines) datasets for the common periods. Values are the linear decadal trend slopes and the 95% confidence intervals (W m−2 decade−1).
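A linear decadal trend with a 95% confidence interval, as reported in Figure 9, could be estimated along the following lines. This sketch assumes an ordinary least-squares fit, a large-sample normal quantile of 1.96 in place of the exact Student-t value, and no correction for autocorrelation; the text does not specify the fitting method, and the function name `decadal_trend` and the synthetic series are hypothetical.

```python
import numpy as np

def decadal_trend(anomalies, months_per_year=12):
    """Return the OLS slope per decade and an approximate 95% half-width."""
    y = np.asarray(anomalies, dtype=float)
    t = np.arange(y.size) / (months_per_year * 10.0)   # time axis in decades
    valid = ~np.isnan(y)
    t, y = t[valid], y[valid]
    slope, intercept = np.polyfit(t, y, 1)
    resid = y - (slope * t + intercept)
    # Standard error of the slope from the residual variance.
    s2 = (resid ** 2).sum() / (y.size - 2)
    se = np.sqrt(s2 / ((t - t.mean()) ** 2).sum())
    # Assumption: 1.96 approximates the t quantile for long monthly series.
    return slope, 1.96 * se

# Synthetic 17-year monthly anomaly series (W m-2); not real data.
rng = np.random.default_rng(0)
n_months = 17 * 12
series = 1.5 * np.arange(n_months) / 120.0 + rng.normal(0.0, 5.0, n_months)
slope, half_width = decadal_trend(series)
# The trend is insignificant at the 95% level when the interval contains zero.
significant = abs(slope) > half_width
```

The trend discrepancy between two records is then simply the difference of their slopes: with the 1999–2015 values quoted above, −0.71 − 1.45 = −2.16 W m−2 decade−1.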

Conclusions
With its advantage in spatial and temporal coverage, CLARA-A2 allows the study of radiation climatology and trends since the early 1990s over the whole of China. The geostationary satellite-based SARAH-E, with its higher spatial and temporal resolution, shows better accuracy than CLARA-A2. The limited spatial and temporal coverage of SARAH-E, however, prevents validation for the northeastern part of China and for most of the 1990s.
Due to the high aerosol loading and complex terrain in China, the discrepancy between satellite-retrieved and surface-based SSR data is larger than in most other regions of the world. Averaged over all 59 stations used in this study, the comparison shows an overestimation of 10.0 W m−2 by CLARA-A2 and 7.5 W m−2 by SARAH-E. One of our most interesting findings is a strong urbanization effect behind the large positive bias in China. The bias decreases to −2.1 W m−2 for CLARA-A2 and −3.2 W m−2 for SARAH-E if only the available rural stations are considered. An underestimation of aerosol effects in the satellite retrievals over China is also evident in the seasonal performance. As indicated by all three metrics (MBD, MABD and RMSD), the largest relative deviation is found in winter, followed by spring, autumn and summer, consistent with the seasonal cycle of aerosol concentrations in China. Spatially, the overestimation of SSR in the satellite estimates is not found over the whole of China but occurs mainly in the eastern part, which is characterized by a strong variability and high absolute magnitude of anthropogenic aerosols. In contrast, an underestimation is prevalent in the western part of China, most likely related to the difficulty of capturing the frequent dust storms in the satellite retrievals over the desert regions (i.e., the northwestern part) and to the neglect of reduced atmospheric scattering and the degraded data quality over snow-covered surfaces in the high-elevation regions (i.e., the southwestern part).
Both the CLARA-A2 and SARAH-E datasets are generally capable of reproducing the monthly anomalies of SSR measured at the surface, as indicated by a significant anomaly correlation of around 0.8 at both annual and seasonal scales. The satellite products estimate the SSR trends in China with an insignificant (p > 0.05) difference of about 0.8 W m−2 decade−1 for 1993–2015 and −2.2 W m−2 decade−1 for 1999–2015. The increased discrepancy in the latter period results from an increase in surface-observed SSR combined with a slight decrease in the satellite estimates. As also evidenced by the decrease in the overestimation of SSR in the last decade, a weakening of the attenuating effect of aerosols might have caused the brightening observed in surface measurements, which, however, is not captured by the satellite products because they use a temporally constant aerosol climatology. Therefore, including aerosol variability in the satellite algorithms is crucial for an accurate reproduction of the solar dimming/brightening phenomenon, especially for regions like China with non-negligible aerosol radiative effects.