Validation of Hourly Global Horizontal Irradiance for Two Satellite-Derived Datasets in Northeast Iraq

Several sectors need global horizontal irradiance (GHI) data for various purposes. However, the availability of long-term time series of high-quality in situ GHI measurements is limited. Therefore, several studies have tried to estimate GHI by re-analysing climate data or satellite images. Validation is essential for the later use of GHI data in regions with a scarcity of ground-recorded data. This study adds to previous validation work by validating hourly GHI from two satellite-derived datasets (SDDs), HelioClim-3 version 5 (HC3v5) and the Copernicus Atmosphere Monitoring Service (CAMS) radiation service version 3 (CRSv3), against nine ground stations in northeast Iraq that have not been used previously. The validation is carried out with station data at the pixel locations and at two other data points in the vicinity of each station, which is rarely seen in the literature. The temporal and spatial trends of the ground data are well captured by the two SDDs. Correlation ranges from 0.94 to 0.97 in all-sky and clear-sky conditions in most cases, while it is 0.51–0.72 for cloudy-sky conditions and 0.82–0.89 for the clearness index. The bias is negative in most cases, with three positive exceptions. It ranges from −7% to 4% and from −8% to 3% for the all-sky and clear-sky conditions, respectively. For cloudy-sky conditions, the bias is positive and differs from one station to another, from 16% to 85%. The root mean square error (RMSE) ranges between 12–20% and 8–12% for all-sky and clear-sky conditions, respectively. In contrast, the RMSE is significantly higher in cloudy-sky conditions: above 56%. The bias and RMSE for the clearness index are nearly the same as those for the GHI in all-sky conditions. The hourly GHI from the SDDs differs by only 2% between the station location and the data points around each station. The variability of the two SDDs is quite similar to that of the ground data, based on the mean and standard deviation of hourly GHI in a month. Having station data at different timescales and the small number of stations with GHI records in the region are the main limitations of this analysis.


Introduction
Global horizontal irradiance (GHI), both in the atmosphere and on the earth's surface, is a crucial parameter in the fields of atmosphere interaction, solar energy, architecture, and agriculture.
High-resolution GHI data are required for studying those fields. Therefore, several studies have tried to estimate solar radiation (SR) and its components from either ground measurements or satellite images using several models [1][2][3][4][5][6]. Ground measurements of GHI have high accuracy and high temporal availability, whereas the spatial coverage of the recorded data and the number of stations with SR data are limited in most geographical areas. The reasons are the purchase and high maintenance costs of pyranometers. Satellite images have been analysed to estimate GHI in order to compensate for the scarcity of ground measurement data. Most of the affordable satellite images for that purpose are geostationary satellite images, namely Meteosat First Generation (MFG) and Meteosat Second Generation (MSG)/Spinning Enhanced Visible and Infrared Imager (SEVIRI), the Japanese Geostationary Meteorological Satellite (GMS), and the Geostationary Operational Environmental Satellite system (GOES) [7]. Others, such as the Moderate Resolution Imaging Spectroradiometer (MODIS) [8] and Landsat images [9], have also been used, but their temporal resolution is not adequate for this purpose.
The basic idea of estimating GHI from satellite images is to find the relationship between satellite images and ground measurements, either with statistical or physical approaches [10]. The popular Heliosat-2 (H2) method, which develops Heliosat-1 to be more physical than empirical, can be applied to large time-series data from meteorological satellites. The H2 principle is that the radiance of a cloud pixel is high in the visible band. It tests the difference in reflectance between the cloud pixel and the clear-sky pixel; this is called a cloud index. The cloud index and the Linke turbidity factor are then used to estimate GHI [11]. H2 has been developed further in several studies by changing some inputs to the model [12][13][14]. Other studies have also used satellite imagery with different techniques for GHI estimation [15][16][17].
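The cloud-index idea can be illustrated with a minimal Python sketch. It is not the H2 implementation: the function names are hypothetical, the clear-sky GHI is assumed to come from a separate clear-sky model (for example one driven by the Linke turbidity factor), and the linear relation between cloud index and clear-sky index, together with its clipping bounds, is a simplifying assumption.

```python
def cloud_index(rho, rho_ground, rho_cloud):
    """Cloud index of a pixel from its observed reflectance.

    rho        : reflectance observed by the satellite for the pixel
    rho_ground : reflectance of the same pixel under clear sky
    rho_cloud  : reflectance of a bright, fully cloudy pixel
    Returns about 0 for a clear pixel and about 1 for an overcast pixel.
    """
    if rho_cloud == rho_ground:
        raise ValueError("cloud and ground reflectance must differ")
    return (rho - rho_ground) / (rho_cloud - rho_ground)


def ghi_from_cloud_index(n, ghi_clear, kc_min=0.05, kc_max=1.2):
    """Scale a clear-sky GHI (W/m^2) by a clear-sky index derived from n.

    Assumption: clear-sky index Kc = 1 - n, clipped to [kc_min, kc_max].
    The bounds are illustrative defaults, not values taken from the study.
    """
    kc = min(max(1.0 - n, kc_min), kc_max)
    return kc * ghi_clear


if __name__ == "__main__":
    n = cloud_index(rho=0.35, rho_ground=0.20, rho_cloud=0.80)
    print(round(n, 3), round(ghi_from_cloud_index(n, ghi_clear=800.0), 1))
```

A brighter pixel gives a larger cloud index and therefore a smaller fraction of the clear-sky GHI at the surface.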
There are several satellite-derived datasets (SDDs) for establishing, measuring, modelling and estimating GHI, which can be found in [18,19]. An SDD from MSG has been used to create a solar map [20]. Studies that merged ground data with an SDD for the same purpose reveal that the merging technique produces a better solar map than interpolating ground data [21,22]. SDDs have also been combined with meteorological data to calibrate a GHI model [23]. The same data combinations have also been analysed to create GHI datasets for crop modelling over Europe [24]. SDDs have been utilised to assess long-term trends of a GHI time series [25]. SDDs are quite useful because of the limitations of ground data for GHI applications.
SDDs are necessary for many fields because they provide GHI for many areas and countries. Therefore, validating SDDs with various methods in different geographical and climate areas is crucial for establishing their reliability, and several prior studies have done so. For instance, the Satellite Application Facility on Climate Monitoring (CM SAF) dataset, based on predicting GHI from MFG and MSG, has been validated with ground data at several stations in Europe in the period 1983-1985 [26]. The data from the same dataset, with the H2 method for converting satellite images to GHI, and others (SolarGis and Solemi) have also been validated in 22 cities in Europe [27]. Similarly, the CM SAF dataset of GHI has been compared against ground data at 20 stations in Sweden and Norway, and the result reveals good agreement with an accuracy of 15 W/m², corresponding to an error of roughly 8% [28]. In addition, GHI retrievals from different CM SAF products have been validated against ground measurements at eight sites in Europe under various sky conditions [29]. More broadly, Zhang et al. [19] have evaluated six re-analysed datasets for obtaining GHI against ground data from measurement networks such as the Baseline Surface Radiation Network (BSRN) and others for 674 cities around the globe, with an overall bias of 11-50 W/m². They used a large volume of data in various climate regions and countries; however, the results are shown according to the datasets and measurement networks, rather than for each station. This is useful for comparison between the SDDs and measurement networks, but it does not reflect the real situation at individual stations. Moradi et al. [30] also estimated daily GHI with the H2 method in Iran, evaluating the result of the model with four stations in the country, which revealed a good agreement with 12% RMSE and 2% bias. Schillings et al. [31] validated direct normal irradiance (DNI) data at weather stations in eight cities in Saudi Arabia with Meteosat-7 data using the H2 method. The results indicate a good agreement, with a mean bias of 4.3% from hourly data. Similarly, AL-Jumaily et al. [32] evaluated the GHI data of two Iraqi weather stations with the same method using Meteosat-8 data for the year 2005. Positive biases of 0.024 kWh/m² and 0.012 kWh/m² GHI for daily mean values were found for the two cities. The authors indicate that further research comparing Meteosat-8 data with other areas of Iraq is needed. It is necessary for studies to validate more than one SDD in order to compare them and then select the most accurate one. Recently, GHI data from the MFG and MSG have been evaluated over India, showing an overestimation bias of 10-20% of the daily mean [33]. Some other studies have evaluated SDDs over the United States in different climate regions against several ground-based measurements [34][35][36][37].
The most popular SDDs are those provided by the Solar Radiation Data (SoDa) portal [38], which hosts several projects. One of them is HelioClim-3 version 4 and version 5 (HC3v4-5), which are based on the H2 method for converting MSG satellite images to GHI. Another is the Copernicus Atmosphere Monitoring Service (CAMS) Radiation Service (CRS), which is based on Heliosat-4 (H4) for the same purpose.
Those SDDs have been validated by several studies in various areas. For example, Thomas et al. [39] have validated the hourly GHI from SDDs such as HC3v4-5 and CRS for 42 stations in Brazil. The result reveals a high correlation (an average of 96%) between HC3v4-5 and ground measurements, whereas that with CRS is lower by 2%. Similarly, r values above 0.92 for 15 min and 0.98 for daily GHI, with a bias of roughly 5%, were found when comparing HC3v4-5 and CRS to ground data at 14 stations around the world [40]. Hourly GHI and DNI from HC3v4-5 for all-sky conditions, and from the McClear dataset for clear-sky conditions, have been validated with ground data at seven stations over Egypt, with RMSE ranging from 6-22% [41]. Marchand et al. [42] have validated hourly GHI from HC3v4-5 and CRS with ground data at five stations in the United Arab Emirates and Oman. The overall RMSE is nearly 15% on average.
This study aims to validate the hourly GHI from HC3v5 and CRSv3 against ground measurements at nine stations in the northeast of Iraq, being the first study to validate those SDDs in that region. One objective of this study is to evaluate the spatial-temporal performance of those datasets in all-sky, clear-sky, and cloudy-sky conditions and with the clearness index. Another objective is to use a validation approach that is rare in the literature: comparing the GHI from ground measurements at a station against the GHI from SDDs at the station location and at two points around it, at a spatial resolution of 5 km (corresponding to that of MSG imagery), with each point taken from a different pixel.
The study is organised as follows. The study location, ground data and SDDs are described in Section 2. The validation results are shown in Section 3. The discussion is set out in Section 4, and finally a conclusion is provided.
The hourly GHI data, together with some other climate parameters, were collected from two station types. The first type comprises tower stations, where the pyranometer used for recording data is the Kipp and Zonen CMP6. These data were collected from the Ministry of Electricity-Kurdistan Regional Government (KRG) for five stations over the period 2011-2014, with some years missing (Table 1). The second type comprises automatic stations equipped with a Vaisala QMS101 pyranometer. These data were collected from the General Directorate of Meteorology and Seismology-KRG for four stations (2013-2016), again with some years missing (Table 2).
Note: The periods in the tables are those available for the ground measurements; the SDDs for the same periods have been collected for each station location and for the points around the stations, which were used for validation.

Quality Control of GHI Measurements
Data normalising and cleaning were done by retaining only records with a solar elevation angle above 15°. Missing values were identified and set as not applicable (NA). The two datasets were harmonised to true local solar time. All of the GHI ground data were tested with the BSRN tests [45] and other quality control tests [46,47]. Full information about the quality control of the ground data in this study can be found in [46]. Systematic errors were removed from the data, and values flagged as questionable by the various tests were not used in the validation process.
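As an illustration of this kind of screening, the following Python sketch masks records at low solar elevation and records outside a physically possible range. It is only a sketch under stated assumptions: the function names are hypothetical, the solar elevation is assumed to be supplied with each record, the upper and lower limits use a commonly cited form of the BSRN physically-possible test, and the solar constant is used without any Earth-Sun distance correction. The tests actually applied in the study are those described in [45,46,47].

```python
import math

SOLAR_CONSTANT = 1361.0  # W/m^2; assumed value, no Earth-Sun distance correction


def physically_possible_max(elevation_deg, solar_constant=SOLAR_CONSTANT):
    """Upper GHI limit (W/m^2) in a commonly cited BSRN-style form (assumption)."""
    mu0 = math.sin(math.radians(elevation_deg))  # cosine of the solar zenith angle
    return 1.5 * solar_constant * mu0 ** 1.2 + 100.0


def quality_control(records, min_elevation_deg=15.0, lower_limit=-4.0):
    """Return the GHI series with rejected or missing records set to None (NA).

    records: iterable of (ghi_w_m2 or None, solar_elevation_deg) pairs.
    A record is kept only if the sun is above min_elevation_deg and the
    value lies within the physically possible range.
    """
    cleaned = []
    for ghi, elevation in records:
        if ghi is None or elevation <= min_elevation_deg:
            cleaned.append(None)
        elif ghi < lower_limit or ghi > physically_possible_max(elevation):
            cleaned.append(None)
        else:
            cleaned.append(ghi)
    return cleaned


if __name__ == "__main__":
    sample = [(500.0, 40.0), (300.0, 10.0), (None, 50.0), (2000.0, 30.0)]
    print(quality_control(sample))
```

Keeping rejected records as None rather than dropping them preserves the hourly time axis, so the ground series stays aligned with the SDD series.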

Satellite-Derived Datasets
The SoDa portal [38] is owned by MINES ParisTech and Transvalor. It provides datasets of solar radiation components, namely HC3 and CRSv3, which are based on converting MSG satellite images within the field of view of the SEVIRI instrument, covering Europe, Africa, the Middle East and part of South America (Figure 2). The hourly GHI data from HC3v5 and CRSv3 for each station location and for points around each station have been collected from the SoDa website, based on the available period of ground data.

HelioClim-3 (HC3)
The HC3 dataset has been created by converting MSG images to estimate the GHI every 15 min since February 2001 using the original H2 method. The principle of H2 is to calculate solar radiation statistically from the cloud cover index, which is derived from the reflectance in the visible image of MSG and the ground albedo [30]. The method has been modified several times with various inputs. It initially refers to Cano et al. [10], and a new method was published in [11]. The MSG image processing in this model gives GHI. The DNI and the diffuse horizontal irradiance are then estimated from it [41].
The most common versions of HC3 are v4 and v5. The inputs to v4 are the clear-sky model of the European Solar Radiation Atlas (ESRA) and the Linke turbidity factor. One limitation of this release is that it does not detect local effects on the Linke turbidity factor [48]. The clear-sky model gives solar radiation globally every three hours for cloud-free skies [49]. HC3v5 works largely on the same principle as HC3v4, but differs in that it uses the McClear model [42]. McClear is also a model that provides solar radiation under clear-sky conditions. It accounts for the optical depth of the atmospheric column, which contains aerosols, water vapour and ozone. It is provided by the Copernicus Atmosphere Monitoring Service [50]. The data within those datasets are available online [38] for the MSG coverage, free of charge from 2004 to 2006 and with payment from 2007 onwards.

Copernicus Atmosphere Monitoring Service (CAMS), Radiation Service (CRS)
CRS is a dataset of solar radiation components, which provides Heliosat-4 (H4) data using the satellite images of MSG. H4 is a modification of the previous Heliosat versions. Ground albedo from MODIS and the McClear model are used in this method [39,48,49]. The data are available for free from 2004 until two days before the present for the areas covered by MSG images. The third version of CRS is available after bias correction [38]. This study has used CRS version 3 (CRSv3). Further information about the HC3v4-5 and CRSv3 projects can be found at SoDa [38] and in [39][40][41][42]48,49,51].

Validation Criteria
The validation approach is illustrated in Figure 3. Most of the previous studies validating GHI SDDs against ground data have separated the data into all-sky and clear-sky conditions [27,31,41]. The division also depends on the clearness index (Kt). The Kt is calculated by dividing the hourly GHI ground data by the top-of-atmosphere radiation on the horizontal surface (TOA). The TOA was collected from SoDa [38]. For the calculation of the Kt of the SDDs, see Figure 3. The Kt was used for validation and for setting limits among the various sky conditions [27,52]. This study separates the ground data into all-sky, clear-sky and cloudy-sky conditions based on these Kt limits. This is to test the SDDs in various situations and to demonstrate under which situations the SDDs are the most accurate.
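The Kt calculation and the split into sky conditions can be sketched in Python as follows. The function names are hypothetical, and the two threshold values are illustrative placeholders only; they are not the limits used in this study, which follow [27,52].

```python
def clearness_index(ghi, toa):
    """Kt = GHI / TOA, both in W/m^2 on the horizontal plane.

    Returns None when TOA is not positive (sun at or below the horizon).
    """
    if toa <= 0:
        return None
    return ghi / toa


def sky_condition(kt, cloudy_max=0.3, clear_min=0.65):
    """Classify one hourly record by its clearness index.

    cloudy_max and clear_min are placeholder limits chosen for illustration.
    Every valid record also belongs to the all-sky set.
    """
    if kt is None:
        return None
    if kt >= clear_min:
        return "clear"
    if kt <= cloudy_max:
        return "cloudy"
    return "intermediate"


if __name__ == "__main__":
    for ghi, toa in [(700.0, 1000.0), (150.0, 900.0), (450.0, 950.0)]:
        kt = clearness_index(ghi, toa)
        print(round(kt, 2), sky_condition(kt))
```

Classifying on the ground-based Kt, as the study does, means each SDD value is assessed under the sky condition actually observed at the station.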
The approach uses the ground data of a station to assess the SDDs with data from the station location pixel and from two other SDD pixels. One data point is selected to the east and another to the west of a station, at a distance of 6-10 km (Tables 1 and 2, Figure 4). This ensures that each point falls in a different pixel from the station location pixel, given that the spatial resolution of MSG imagery is 5 km in the case study region (Figure 2). Hereafter, P1 denotes the west point for each station and P2 denotes the east point. This allows the validation of the SDDs to be investigated for more than one pixel around the station and addresses whether the SDD values from neighbouring pixels are the same or different, because the solar radiation intensity may be the same over an area of 25 km² [23,53].
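One way to derive such neighbouring points is to shift the station longitude by a fixed ground distance along the same parallel. The Python sketch below does this on a spherical Earth; the function name, the default distance of 8 km and the example coordinates are hypothetical and are not taken from Tables 1 and 2.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def east_west_points(lat_deg, lon_deg, distance_km=8.0):
    """Return (P1, P2): points due west and due east of a station.

    Each point keeps the station latitude and is shifted in longitude by
    the angle that corresponds to distance_km along that parallel.
    """
    parallel_radius_km = EARTH_RADIUS_KM * math.cos(math.radians(lat_deg))
    dlon_deg = math.degrees(distance_km / parallel_radius_km)
    p1_west = (lat_deg, lon_deg - dlon_deg)
    p2_east = (lat_deg, lon_deg + dlon_deg)
    return p1_west, p2_east


if __name__ == "__main__":
    # Hypothetical station coordinates, for illustration only.
    p1, p2 = east_west_points(35.0, 45.0, distance_km=8.0)
    print(p1, p2)
```

With a pixel size of about 5 km, an offset of 6-10 km is enough for each point to fall outside the station pixel.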
The agreement between the ground data and the SDDs, namely HC3v5 and CRSv3, has been evaluated with statistical indicators: the correlation coefficient (r) in Equation (1), the bias in Equation (2), the relative bias in Equation (3), the root mean square error (RMSE) in Equation (4), and the relative RMSE (rRMSE) in Equation (5) [48,54]. These were computed for the hourly GHI in all-sky conditions at the stations and at the points around them, for the clearness index in all-sky conditions at the stations, and for the GHI in clear-sky and cloudy-sky conditions at the stations.
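For reference, these indicators are commonly written as follows. This is a sketch using the standard definitions rather than a verbatim copy of the published equations: the sign convention (SDD minus ground, so that a positive bias is an overestimation) matches the way the results are reported, while the normalisation of the relative values by the mean of the ground data, $\bar{X}$, is an assumption; $\bar{Y}$ denotes the mean of the SDD values.

```latex
\begin{align}
r &= \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}
          {\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \, \sum_{i=1}^{n} (Y_i - \bar{Y})^2}} \tag{1} \\
\mathrm{bias} &= \frac{1}{n} \sum_{i=1}^{n} (Y_i - X_i) \tag{2} \\
\mathrm{rbias} &= \frac{\mathrm{bias}}{\bar{X}} \times 100 \tag{3} \\
\mathrm{RMSE} &= \sqrt{\frac{1}{n} \sum_{i=1}^{n} (Y_i - X_i)^2} \tag{4} \\
\mathrm{rRMSE} &= \frac{\mathrm{RMSE}}{\bar{X}} \times 100 \tag{5}
\end{align}
```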
In these equations, n is the number of observations, Xi is the GHI of the ground data and Yi is the GHI of an SDD.
The performance of the two SDDs against the ground data has also been assessed in all-sky conditions, using the monthly mean and standard deviation of hourly GHI, to show how well the SDDs reproduce the variability of the ground data. The monthly mean and standard deviation of hourly GHI were calculated for the ground data and the SDDs for each month in the selected period of a station.
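A minimal Python sketch of the five indicators is given here for illustration. The function name is hypothetical; the bias is taken as SDD minus ground, so that positive values indicate overestimation, and the relative values are assumed to be expressed as a percentage of the mean ground GHI.

```python
import math


def validation_metrics(ground, sdd):
    """Compute r, bias, relative bias, RMSE and relative RMSE.

    ground, sdd: equal-length sequences of hourly values (W/m^2 for GHI);
    pairs in which either value is None are skipped.
    """
    pairs = [(x, y) for x, y in zip(ground, sdd)
             if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("at least two valid pairs are required")

    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")

    bias = sum(y - x for x, y in pairs) / n                   # SDD minus ground
    rmse = math.sqrt(sum((y - x) ** 2 for x, y in pairs) / n)
    return {
        "r": sxy / math.sqrt(sxx * syy),
        "bias": bias,
        "rbias_pct": 100.0 * bias / mean_x,
        "rmse": rmse,
        "rrmse_pct": 100.0 * rmse / mean_x,
    }


if __name__ == "__main__":
    ground = [100.0, 200.0, 300.0, 400.0]
    sdd = [110.0, 190.0, 310.0, 390.0]
    print(validation_metrics(ground, sdd))
```

Applying the same function separately to the all-sky, clear-sky and cloudy-sky subsets gives one set of indicators per sky condition.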
Remote Sens. 2018, 10, x FOR PEER REVIEW 7 of 22

Results
The results of validating the SDDs against ground measurements at nine stations in the northeast of Iraq are shown as follows. Table 3 presents the results of the hourly GHI in all-sky conditions for the stations and the points around them; Table 4 presents the results of the clearness index in all-sky conditions for the stations; and Table 5 presents the results of the hourly GHI in clear-sky and cloudy-sky conditions for the stations. Figures 5 and 6 show the monthly mean and standard deviation of hourly GHI for each SDD together with the ground data. Figures 7 and 8 compare the stations with the points around them, and the two SDDs, in all-sky, clear-sky and cloudy-sky conditions in terms of the percentage bias and the percentage RMSE, respectively. Scatterplots for some example stations are shown in Figures 9-11 for the GHI and clearness index in all-sky conditions, and for the GHI in clear-sky and cloudy-sky conditions.
Overall, the study focuses on all-sky conditions, for which the results are shown in several ways: the clearness index, the monthly mean and standard deviation, and the statistical indicators, the latter being used in all three sky conditions. This avoids overly complex results from presenting all of the above for every sky condition.

All-Sky Conditions
The results of the validation for the all-sky conditions are presented in Table 3. The correlation ranges from 0.94 to 0.97 at all of the stations and from 0.92 to 0.97 at the points around them. Interestingly, zero bias was recorded at Hojava station, and it was near zero in several other cases (Halsho, Maydan and Bazian stations) for both SDDs. Negative (underestimation) bias was recorded in several cases. It ranges from −21 W/m² (−4%) to −3 W/m² (−0.6%) for HC3v5, which is smaller in magnitude in most cases than for CRSv3, where it ranges from −27 W/m² (−5.3%) to −2 W/m² (−0.4%). Moreover, its magnitude was below 6% at all of the stations. However, a positive (overestimation) bias was recorded in some of the cases. The highest bias in the study was recorded at Kalar station: 25 W/m² (5.3%) for CRSv3 and 21 W/m² (4.4%) for HC3v5. Some other positive values were recorded at Halsho and Maydan stations, which were lower than 2% for both SDDs (Figure 7, Table 3).
The bias at each station was compared to that at the points around the station, and was nearly the same, with no more than 4% difference at any station. Overall, the bias of HC3v5 was smaller than that of CRSv3 (Figure 7, Table 3).
The RMSE was under 21% in all of the cases. For HC3v5, its lowest value was 64 W/m² (12%) at Enjaksor, and its highest value was 88 W/m² (19%) at Kalar station. It was generally higher for CRSv3, ranging from 71 W/m² (14%) to 100 W/m² (20%). Most of the other RMSE values were between 14% and 18% for both SDDs.
The RMSE at the points around the stations is nearly the same as at the station locations (Figure 8). Overall, the RMSE for HC3v5 was lower than that for CRSv3 (Figure 8).
The smooth scatter density plots illustrate the residuals and the correlation between the ground data and the SDDs in some cases. For example, Figure 9 for Batufa station shows that the density of observations was mostly under the 1:1 line, which indicates a negative bias, and the RMSE was acceptable for HC3v5 and high for CRSv3, while some of the values above the 1:1 line are under 200 W/m². However, Figure 10 (Kalar station) shows that the majority of observations are above the 1:1 line and that some points are far from the line. This corresponds to the positive bias and high RMSE at that station.
Low bias and RMSE were recorded at Maydan station, as shown in Figure 11 for HC3v5 and CRSv3. The best-fit line in red is nearly the same as the 1:1 line, more so for HC3v5 than for CRSv3 at the same station.
The results for the clearness index in all-sky conditions are presented in Table 4. The percentages of bias and RMSE were quite similar to those for the GHI. However, the r values at all of the stations were lower than those for the GHI, ranging from 0.82 to 0.89. The r values were higher for HC3v5 than for CRSv3 at each station. The scatter density plots show the highest density of observations at a clearness index of around 0.7 for all of the stations. A negative bias at Batufa and Maydan and a positive bias at Kalar can be seen, although the values were low. Several values are far from the 1:1 line, resulting in an RMSE of 15-21% at all of the stations (Figures 9-11).
The monthly mean and standard deviation of the GHI are presented in Figure 5 for HC3v5 and in Figure 6 for CRSv3. The two figures show the distribution of the two SDDs and the ground data in each month, expressed by the standard deviation. Some differences were recorded in the winter months at Batufa, Hojava, Halsho and Bazian stations, whereas for the summer months differences were recorded at Maydan and Kalar stations, for both SDDs.
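The monthly aggregation behind this comparison can be sketched in Python as follows. The function name is hypothetical, and the use of the sample standard deviation is an assumption, since the source does not state which estimator was used.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, stdev


def monthly_mean_std(timestamps, values):
    """Group hourly values by (year, month) and return {key: (mean, std)}.

    timestamps: datetime objects for each hourly record.
    values: hourly GHI (W/m^2); None entries (missing data) are skipped.
    The sample standard deviation is used; a month with a single value
    is given a standard deviation of 0.0.
    """
    groups = defaultdict(list)
    for stamp, value in zip(timestamps, values):
        if value is not None:
            groups[(stamp.year, stamp.month)].append(value)
    return {
        key: (mean(vals), stdev(vals) if len(vals) > 1 else 0.0)
        for key, vals in sorted(groups.items())
    }


if __name__ == "__main__":
    stamps = [datetime(2013, 1, 5, 10), datetime(2013, 1, 5, 11),
              datetime(2013, 1, 5, 12), datetime(2013, 2, 1, 11),
              datetime(2013, 2, 1, 12)]
    ghi = [100.0, 200.0, 300.0, 400.0, 600.0]
    print(monthly_mean_std(stamps, ghi))
```

Running the function once on the ground series and once on each SDD series gives the paired monthly values that such figures compare.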

Clear-Sky and Cloudy-Sky Conditions
The results of the two SDDs for clear-sky and cloudy-sky conditions are presented in Table 5. There are apparent differences in the r values between the clear-sky and cloudy-sky conditions in all cases. The r values ranged from 0.95 to 0.97 for clear skies, whereas for cloudy skies they ranged from 0.51 to 0.72 among the stations.
Similarly, the bias was much higher in cloudy skies than in clear skies. The lowest bias was a 2 W/m² (0.3%) overestimation at Maydan in clear-sky conditions, whereas at the same station it reached 30 W/m² (24%) in cloudy-sky conditions. The highest bias for the clear-sky conditions was recorded at Batufa station, at −32 W/m² (−4.7%). The same station had the highest bias for the cloudy-sky conditions, at 89 W/m² (84%). These values were recorded for HC3v5. The bias for CRSv3 was the same as for HC3v5 at the low end of the range in clear-sky conditions. Otherwise, it ranged from a −45 W/m² (−6.6%) underestimation at Batufa station to a 7 W/m² (1%) overestimation at Kalar station, while for cloudy skies it ranged from 33 W/m² (29%) to 90 W/m² (85%). The variation of the two SDDs in terms of bias is shown in Figure 7, which illustrates a moderate difference between cloudy-sky and clear-sky conditions from one station to another. In addition, the bias was much lower in clear skies than in cloudy skies, and the bias in all-sky conditions was lower than that in clear skies, except at two stations where, for both SDDs, the bias was somewhat higher in all-sky conditions than in clear skies (Figure 7).
Similarly, the RMSE was much higher in cloudy skies than in clear skies for both SDDs at each station. For example, at Halsho station for CRSv3 it was 61 W/m² (9%) in clear skies and increased sharply to 127 W/m² (120%) under cloudy conditions. Nearly the same situation can be seen at Batufa station for HC3v5. The RMSE for both SDDs was lower than 14% at all of the stations for clear skies, while it was above 58% for cloudy skies (Figure 8, Table 5). The RMSE in clear skies was lower than in all-sky conditions at all of the stations for both SDDs (Figure 8).
The smooth scatter density plots for Batufa, Kalar and Maydan show, for each SDD, that in cloudy-sky conditions the density of observations was above the 1:1 line and the distribution was skewed towards high SDD values, which corresponds to the high overestimation and high RMSE (Figures 9-11). In contrast, in clear-sky conditions the distribution of observations was near the 1:1 line for nearly all of the stations. This leads to a low RMSE in clear skies compared to cloudy-sky and all-sky conditions (Figures 9-11). The observations under the 1:1 line illustrate a negative bias at Batufa station, whereas the opposite (i.e., a positive bias) occurred at Kalar station, and a minimal bias was seen at Maydan station relative to the normal distribution.
Overall, the results of the validation varied from one station to another, and they are acceptable according to the bias and RMSE, although at some stations the results were disappointing. The points around the stations had nearly the same ranges of bias and RMSE as the station locations. In most of the cases, the bias and RMSE of HC3v5 were lower than those of CRSv3. The bias and RMSE were lower for the clear-sky and all-sky conditions than for cloudy skies.

Discussion
The validation results demonstrate good agreement between the ground data and the SDDs in all-sky and clear-sky conditions (average r = 0.95, bias under 6% and RMSE under 21%), unlike the results for the cloudy-sky conditions (average r = 0.61, bias above 16% and RMSE above 61%). The results from the two neighbouring points at each station are close to the results at the station location, with an average difference of 2%. Overall, the performance of the SDDs is in agreement with that reported by similar studies in other areas [39,41,42,48], which also show a better performance for HC3v5 than for CRSv3 (Figures 7 and 8). This is mainly related to the inputs into the models for creating each dataset, whether H2 or H4 (see Section 2.3).

All-Sky Conditions
The high positive bias and RMSE at Kalar compared to all of the other stations (Figures 7-11) might be due to the quality of the recorded data [46] and a partial shadow on the sensor at that station, because its mean is lower than that at the other stations by nearly 30 W/m² (Table 3), even though those data passed the quality check. However, a similar positive bias and RMSE are reported by other studies [41,42]. Low bias (0-2%) was recorded for HC3v5 at the stations of Hojava, Halsho, Bazian and Maydan, and for CRSv3 at the same stations except for Hojava, whose bias reaches −3.8% for all-sky conditions (Table 3, Figure 7). These results show that the GHI ground data are explained well by the satellite data (Figures S1-S5), even though the resolution of the satellite imagery in that area is 5 km (Figure 2). Underestimation (bias of 1-6%) and an RMSE of 12-20% were recorded at most stations (Table 3, Figures 7 and 8). Comparable percentages were recorded in similar studies in other areas and climate regions, namely Egypt [41], Brazil [39], and some BSRN stations [48]. The reasons are partly related to the local conditions of the station, the inputs to the Heliosat method (especially the atmospheric optical depth, owing to its unavailability), various cloud types, the resolution of the satellite images and the effect of aerosols [37,39,55]. This is because in some cases the GHI ground data are well explained by the SDDs (Figures S1-S5), but in other cases some error was recorded (Table 3). Those error rates are quite reasonable for hourly GHI [42]. The bias can be corrected or modified in several ways [14,18,56,57].
The small differences in variability between the ground data and both SDDs are seen in Figures 5 and 6. This might be related to the geographical location and climatic conditions; another reason is that the data were aggregated, concealing some random errors between the two datasets. The errors in the winter months are related to the difficulty the Heliosat methods have in estimating GHI in cloudy conditions [31,51]. The performance for the clearness index (Table 4) is nearly the same as for the GHI in all-sky conditions, which is related to the above-mentioned reasons.
An interesting aspect of this study is that the validation results for both SDDs at the two neighbouring points of each station, taken separately, are close to those at the station location. The differences range between 0-2% for bias and RMSE for each point at most of the locations (Table 3, Figures 7 and 8). The ±1-4% difference between the station location and the neighbouring points, with respect to the station ground data GHI, is mainly related to the elevation above sea level of each location. Other factors might be related to local land surface types such as land and water, agriculture and bare soil. This indicates that the GHI from SDDs can be used for regional planning for various purposes, and that the ground data GHI can be used for neighbouring areas when ground data are limited. This validation is also considered to add further weight to the assumption of near uniformity of solar radiation in a 25 km² area [23,53].
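The comparison between the station pixel and the neighbouring points can be sketched as follows. This is an illustrative example only, assuming that each SDD series (station pixel and neighbouring points) is validated against the same station ground data and that the relative bias is expressed as a percentage of the mean ground GHI. The names `relative_bias_pct`, `neighbour_differences`, `point_1` and `point_2`, and all values, are hypothetical.

```python
def relative_bias_pct(ground, sdd):
    """Relative bias as % of mean ground GHI (assumed deviation = sdd - ground)."""
    n = len(ground)
    mean_g = sum(ground) / n
    bias = sum(s - g for g, s in zip(ground, sdd)) / n
    return 100 * bias / mean_g


def neighbour_differences(ground, sdd_by_point, station_key="station"):
    """Difference in relative bias (percentage points) between each
    neighbouring point and the station pixel, all validated against the
    same station ground data."""
    station_rbias = relative_bias_pct(ground, sdd_by_point[station_key])
    return {
        key: relative_bias_pct(ground, series) - station_rbias
        for key, series in sdd_by_point.items()
        if key != station_key
    }


# Hypothetical hourly values (W/m^2), for illustration only.
ground = [200.0, 400.0, 600.0]
sdd_by_point = {
    "station": [190.0, 395.0, 585.0],
    "point_1": [194.0, 399.0, 589.0],
    "point_2": [186.0, 391.0, 581.0],
}
diffs = neighbour_differences(ground, sdd_by_point)
```

A small difference for every neighbouring point would be consistent with the near uniformity of GHI over a few kilometres described above.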

Clear-Sky and Cloudy-Sky Conditions
The validation results for both SDDs of GHI with the ground data for clear-sky conditions showed good agreement according to RMSE, which decreased at most stations (Figures 7-11). This is partly related to the inputs to the H2 method, especially the incorporation of the visible images of MSG in cloud-free conditions into the model. Similarly, an increased performance of HC3v5 for clear-sky conditions has been reported in Egypt [41]. The remaining residuals in clear-sky conditions are caused by the factors mentioned for the all-sky conditions above. However, the bias increased to some degree for both SDDs at most of the stations, which recorded underestimations for clear-sky conditions. This is partially related to the increase of the mean GHI of the ground data in clear-sky conditions. Several studies [19,27,37] have also recorded an underestimation for clear skies.
The study also investigated the performance of the SDDs under cloudy-sky conditions, which was very low for both HC3v5 and CRSv3, as shown by the high bias and RMSE (Figures 7 and 8). A close look at the samples of smooth scatter plots (Figures 9-11) shows how far the observations and their density are from the 1:1 line. This is related to difficulties in analysing cloudy pixels of MSG images [31]: the clouds prevent the ground from being viewed by the sensor aboard the satellite [42], and as such it is hard to differentiate between cloud albedo and ground albedo [51]. These factors lead to an overestimation of GHI, as shown by the bias at all of the stations (Table 5), and a much higher RMSE (Figures 6 and 7). Indeed, in some of the cases, it is above the mean of the observations. Similar high residuals for cloudy conditions have been reported in the literature [19,27,37,41]. This indicates that the GHI ground data are well explained by the SDDs in clear-sky conditions, whereas they are not explained well in cloudy-sky conditions (Figures S6-S15). The high bias and RMSE indicate that further research is required to correct the errors under cloudy-sky conditions, whereas several studies have done bias corrections for all-sky conditions [14,18,56,57].
The limitations of this study are the different data timescales from one station to another and the limited information available for some parameters, such as the aerosols and local atmospheric properties. This might make it challenging to fully explain the reasons behind the results at each station.
Nevertheless, the validation results, which vary from one station to another, are near the World Meteorological Organisation (WMO) standard, whereby the bias should be less than 3 W/m² and 95% of errors should not exceed 20 W/m² [50]. However, the validation results at a minority of stations are above the WMO standard. Therefore, it is probable that the SDDs can be used for modelling and mapping solar radiation with some modification and bias correction.
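The WMO criterion quoted above can be expressed as a simple check. The following is a minimal sketch, assuming that the criterion is applied to the absolute value of the mean bias and to the absolute hourly deviations (SDD minus ground data); how the criterion was actually applied in the study is not stated here. The function name `meets_wmo_criterion` and the sample values are hypothetical.

```python
def meets_wmo_criterion(ground, sdd, max_bias=3.0, max_error=20.0, share=0.95):
    """Check |bias| < max_bias (W/m^2) and that at least `share` of the
    absolute deviations do not exceed max_error (W/m^2).

    Assumed interpretation of the criterion; deviation = sdd - ground.
    """
    dev = [s - g for g, s in zip(ground, sdd)]
    n = len(dev)
    bias = sum(dev) / n
    share_within = sum(1 for d in dev if abs(d) <= max_error) / n
    return {
        "bias": bias,
        "share_within": share_within,
        "passes": abs(bias) < max_bias and share_within >= share,
    }


# Hypothetical paired hourly values (W/m^2), for illustration only.
ground = [500.0] * 20
sdd_ok = [501.0] * 19 + [525.0]      # one deviation above 20 W/m^2
sdd_bad = [501.0] * 18 + [525.0] * 2  # two deviations above 20 W/m^2
result_ok = meets_wmo_criterion(ground, sdd_ok)
result_bad = meets_wmo_criterion(ground, sdd_bad)
```

In this hypothetical example the first series satisfies both conditions, while the second fails because only 90% of its deviations stay within 20 W/m².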

Conclusions
The study has validated hourly GHI from two SDDs, HC3v5 and CRSv3, with ground data from nine stations in northeast Iraq for all-sky, clear-sky and cloudy-sky conditions at the station pixels, and with two other pixels around each station in all-sky conditions. The temporal changes of ground data GHI were well represented by both SDDs; r was above 0.94 for the all-sky and clear-sky conditions, and above 0.82 for the clearness index in most cases, while for cloudy skies it was between 0.51-0.72. The bias was negative (underestimation) for most of the cases except for two HC3v5 and three CRSv3 cases, in which it was positive (overestimation); all of the bias values were smaller than 8% of the mean GHI (W/m²) in all-sky and clear-sky conditions, whereas for cloudy-sky conditions, the bias was positive and varied from one station to another, by 17-85% of the mean GHI (W/m²). The same applies to RMSE. It ranged between 8-20% of the mean GHI (W/m²) at all of the stations for all-sky and clear-sky conditions. In contrast, the range was much higher in cloudy-sky conditions: above 56%. The differences between neighbouring pixels and at-station pixels in the SDDs compared to the ground data of GHI for each site are very small, varying by 2% in most cases. The overall performance of HC3v5 is better than that of CRSv3.
Despite the errors at some stations, the SDDs are closely related to the ground data at most of the stations, even though the resolution of MSG images is 5 km in the case study. The SDDs represent hourly GHI well, and they can be used to map solar resources and possibly for modelling GHI with ground data in areas with a low number of stations.
Further studies about the bias corrections in the SDDs are needed, especially for cloudy-sky conditions. Further research would also be useful for validating the SDDs in other climates. Some studies are also required to address the inputs to the Heliosat method, according to regional and local factors, for a better estimation of GHI from satellite images.

Supplementary Materials:
The following are available online at http://www.mdpi.com/2072-4292/10/10/1651/s1. Figures S1-S5: Cumulative frequency function of ground data compared to SDDs for all-sky conditions for the available period of data pairs. The closer the red and green lines (SDDs) are to the black line (ground data), the better the performance of the SDDs. A difference between the lines shows the errors.

Funding:
The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, and in the decision to publish the results.

Acknowledgments:
The Higher Committee for Education Development in Iraq (HCED) funded this study as a scholarship, and the Center for Landscape and Climate Research (CLCR) and the National Centre for Earth Observation (NERC) supported it. The authors are extremely grateful for the assistance of the Directorate of Meteorology-Sulaymaniyah and the KRG Ministry of Electricity for providing meteorological data. The authors are grateful to Soda Service for allowing access to and free use of the GHI SDD of CRSv3 data and for a subscription to use the HC3v5 data. Special thanks go to Prof. Lucien Wald, MINES ParisTech-France, for his oral advice towards the study.

Conflicts of Interest:
The authors declare no conflict of interest.

Figure 3. The flowchart of the approach.

Figure 4. Example of point pixel selection of SDD around Bazian station.

Figure 5. Monthly mean and standard deviation of hourly GHI data in each month aggregated over the data availability for each station with HC3v5. The difference between dots reveals the errors in a month and vice versa. If the dot of the SDD in a month is above the dot of the ground data, it denotes overestimation; otherwise, it denotes underestimation.

Figure 6. Monthly mean and standard deviation of hourly GHI data in each month aggregated over the data availability for each station with CRSv3. The difference between dots reveals the errors in a month and vice versa. If the dot of the SDD in a month is above the dot of the ground data, it denotes overestimation; otherwise, it denotes underestimation.

Figure 7. Comparison of rBias for the hourly GHI for all-sky conditions among stations with points around them for HC3v5 and CRSv3. Clear-skies and cloudy-skies at stations are represented by blue, light blue, black and grey colours, respectively.

Figure 9. Scatter plot between hourly GHI ground measurements and SDDs (HC3v5 and CRSv3) for Batufa station for all-sky, clear-sky and cloudy-sky conditions and the clearness index.

Figure 10. Scatter plot between hourly GHI ground measurements and SDDs (HC3v5 and CRSv3) for Kalar station for all-sky, clear-sky and cloudy-sky conditions and the clearness index.

Figure 11. Scatter plot between hourly GHI ground measurements and SDDs (HC3v5 and CRSv3) for Maydan station for all-sky, clear-sky and cloudy-sky conditions and the clearness index.

Figures S6-S15: as in Figure S1 but for clear-sky and cloudy-sky conditions, respectively.

Author Contributions:
B.A. conducted this research as part of his Ph.D. studies at the University of Leicester, supervised by H.B. and C.J.; E.W., C.T. and M.M. contributed to the study through writing, review and editing. The manuscript was prepared by B.A. with suggestions and corrections from all co-authors. All authors have approved the final draft.

Table 1. Tower stations with hourly GHI from Kipp and Zonen CMP6 Pyranometer.

Table 2. Automatic stations with hourly GHI from Vaisala QMS101 Pyranometer. Note: The periods in the table are available for the ground measurements; the SDDs for the same periods have been collected for each station location and points around the stations, which were used for validation.

Table 3. Validation of hourly GHI under all-sky conditions for stations and points around them. Mean, bias and RMSE units are W/m².

Table 4. Validation of hourly GHI under all-sky conditions for the clearness index. Mean, bias and RMSE units are W/m².

Table 5. Validation of hourly GHI under clear-sky and cloudy-sky conditions. Mean, bias and RMSE units are W/m².