A Deep Learning Framework for Estimating Global and Diffuse Solar Irradiance Using All-Sky Images †

Abstract: Nowadays, all-sky imagers (ASIs) provide valuable information regarding the sky's state, and they have been extensively used in cloud detection, segmentation, and solar forecasting studies. In this study, global and diffuse horizontal irradiances (GHI and DHI) are modeled using a Convolutional Neural Network (CNN) and Red-Green-Blue (RGB) information retrieved through ASI images. The predicted GHI and DHI underestimated the observations with systematic biases of −1.8 W m−2 and −0.5 W m−2, while the dispersion errors were 82.7 W m−2 and 39.8 W m−2, respectively. The correlation coefficient was high, approaching 0.95 and 0.85 for GHI and DHI, respectively.


Introduction
The design of solar energy projects requires long-term, up-to-date, high-quality solar radiation datasets at the finest spatiotemporal resolution. Accurate knowledge of surface solar irradiance and its components under all-sky conditions is vital for assessing the solar potential of a specific area. Clouds are significant in the climate-atmosphere continuum, modifying the incoming solar irradiance reaching the Earth's surface. The strongly varying spatiotemporal structure of the cloud field results in uncertain solar irradiance estimations. All-sky imagers (ASIs) are valuable tools providing continuous information regarding the sky's state. ASIs have been extensively used in cloud detection, segmentation, and solar forecasting applications, e.g., [1-4].
Modeling all-sky solar irradiances is very challenging due to the uncertain behavior of clouds. Today, the Copernicus Atmospheric Monitoring Service for radiation (CAMS-Rad) has gained significant visibility among the available solar products. The CAMS-Rad service provides solar data at various temporal resolutions, easily retrieved through the Solar Radiation Data (SoDa) website (http://www.soda-pro.com/, accessed on 1 March 2023). In CAMS-Rad, solar irradiances in cloudy atmospheres are retrieved using the concept of the cloud modification factor (a function of cloud attenuation and ground reflection) [5], while the cloud properties are extracted from MSG satellite images through the APOLLO method [6]. Among other models, the Fast All-sky Radiation Model for Solar applications (FARMS) [7] parametrizes all-sky irradiances using the REST2 clear-sky model and look-up tables of cloud transmittances and reflectances (created at various cloud optical thicknesses, cloud particle sizes, and solar zenith angles) from the Rapid Radiative Transfer Model (RRTM). However, such modeling approaches require a significant set of parameters that are not always available at the required spatiotemporal resolution. Alternatively, solar irradiances can be modeled using information from all-sky images and deep-learning approaches [8], showing promising results.
This work focuses on modeling the global and diffuse horizontal irradiances (GHI and DHI) using deep learning techniques and ASI images as independent information. More specifically, Red-Green-Blue (RGB) components from ASI images are imported into a Convolutional Neural Network (CNN) to estimate GHI and DHI. The results are validated against reference solar irradiance measurements.

Material and Methods
In this study, 1 min global and diffuse horizontal irradiances (GHI and DHI) were derived from the radiometric station located on the rooftop of the Laboratory of Atmospheric Physics, University of Patras, Greece (38.291° N, 21.789° E). The solar irradiances were measured with Kipp & Zonen CMP11 pyranometers. The instruments were calibrated by the manufacturer; systematic comparisons with a similar instrument showed differences within the standard uncertainty. Moreover, a commercial Mobotix Q24M camera retrieved images of the sky dome, capturing the entire upper hemisphere every 640 µs and storing the images in 24-bit JPEG format with a spatial resolution of 1024 × 768 pixels. The sensor had a Red-Green-Blue (RGB) filter, with color intensities ranging from 0 to 255. The main objective of this study is to model GHI and DHI under all-sky conditions using RGB images. A pre-processing stage was applied to the ASI images before the modeling process. More specifically, the images were cropped using an image mask to retain only the necessary information, avoiding possible obstacles near the image's edges. Then, the cropped images were resized to 128 × 128 pixel resolution to speed up the modeling process. Finally, the RGB values were scaled to 0-1 by dividing by 255. Since solar irradiances exhibit seasonal cycles, normalized forms were calculated. The clear-sky index (CSI) is a detrended form of GHI, defined as the ratio of the measured GHI to that under clear-sky conditions, CSI = GHI/GHIc (Equation (1)), where GHIc is the clear-sky GHI derived from the CAMS McClear model (http://www.soda-pro.com/, accessed on 1 March 2023). On the other hand, a min-max normalization was applied to DHI, DHInorm = (DHI − DHImin)/(DHImax − DHImin) (Equation (2)). Furthermore, only cases with a solar zenith angle (SZA) lower than 80° were kept for the subsequent analysis to avoid low-sun issues and shading effects.
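The pre-processing steps described above (masking, resizing, and scaling, plus the two normalizations) can be sketched as follows. This is a minimal illustration, not the authors' actual code: the function names are hypothetical, and the resize uses simple nearest-neighbour subsampling as a stand-in for a proper image-resize routine (e.g., from PIL or OpenCV).

```python
import numpy as np

def preprocess_image(rgb, mask, size=128):
    """Sketch of the ASI pre-processing: mask, resize to size x size, scale to 0-1.

    rgb  : H x W x 3 uint8 array (a 24-bit JPEG frame, e.g., 1024 x 768)
    mask : H x W boolean array, True inside the usable sky dome
    """
    # Crop with the image mask: zero-out obstacles near the image's edges.
    cropped = rgb * mask[..., None]
    # Nearest-neighbour subsampling to 128 x 128 (placeholder for a real resize).
    h, w, _ = cropped.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    resized = cropped[rows][:, cols]
    # Scale RGB intensities from 0-255 to 0-1.
    return resized.astype(np.float32) / 255.0

def clear_sky_index(ghi, ghi_clear):
    # Equation (1): CSI = GHI / GHIc, with GHIc from the CAMS McClear model.
    return ghi / ghi_clear

def normalize_dhi(dhi, dhi_min, dhi_max):
    # Equation (2): min-max normalization of DHI over the training period.
    return (dhi - dhi_min) / (dhi_max - dhi_min)
```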
A Convolutional Neural Network (CNN) model was employed to model GHI and DHI using the RGB images as inputs. CNNs have been used extensively in processing images, videos, and speech, due to their powerful feature-learning ability and efficient weight-sharing strategy [9]. The model architecture was similar to [8]. The dataset was divided through a randomized splitting procedure, in which 70% of the data was used to build the CNN. This subset was again randomly divided under a 70/30 rule for training and testing purposes, while the remaining 30% of the initial split acted as an independent validation dataset. The mean absolute error (MAE) was selected as the minimization loss function. The output parameters of the CNN were scaled according to Equations (1) and (2), where the minimum and maximum DHI values correspond to the training period, and the RGB images were scaled to 0-1. This data-scaling process ensures that all parameters lie in a similar range, which is necessary for the efficient training of the CNN.

Results
This section discusses the modeling results against the reference measurements. Figure 1 presents the modeled vs. observed GHI and DHI. In general, the modeled GHI and DHI reproduce the observations quite well. For GHI (Figure 1a), in particular, the modeled values were in good agreement with the observations, exhibiting a high coefficient of determination (R² = 0.91). The best-fit line of Figure 1a confirms the underestimation of the observations, since its slope is lower than unity. Several points also lie far from the general tendency, indicating significant dispersion. To quantify these effects, the systematic and dispersion errors (Mean Bias Error, MBE, and Root Mean Square Error, RMSE) and their normalized forms (nMBE and nRMSE), using the average observed irradiance as the skill reference, were computed. A slight underestimation was found, with MBE = −1.8 W m−2 (nMBE = −0.32%), while the dispersion error was 82.7 W m−2 (nRMSE = 15%).
Similar conclusions can be drawn for DHI (Figure 1b). The slope of the best-fit line equaled 0.756, substantially lower than unity, while significant dispersion is evident even by visual inspection. The respective (normalized) systematic and dispersion errors were −0.5 W m−2 (−0.39%) and 39.8 W m−2 (30%). Nevertheless, the modeled DHI was in good agreement with the observations, with R² = 0.74.
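One consistent way to compute the validation statistics used above is sketched below. R² is taken here as the squared Pearson correlation, which matches the scatter-plot best-fit interpretation; the function name is an assumption for illustration.

```python
import math

def error_metrics(modeled, observed):
    """MBE, RMSE and their normalized forms (nMBE, nRMSE, in %), using the
    mean observed irradiance as the skill reference, plus R^2 (squared
    Pearson correlation)."""
    n = len(observed)
    mean_mod = sum(modeled) / n
    mean_obs = sum(observed) / n
    diffs = [m - o for m, o in zip(modeled, observed)]
    mbe = sum(diffs) / n                               # systematic error
    rmse = math.sqrt(sum(d * d for d in diffs) / n)    # dispersion error
    cov = sum((m - mean_mod) * (o - mean_obs)
              for m, o in zip(modeled, observed)) / n
    var_mod = sum((m - mean_mod) ** 2 for m in modeled) / n
    var_obs = sum((o - mean_obs) ** 2 for o in observed) / n
    r2 = cov * cov / (var_mod * var_obs)
    return {
        "MBE": mbe, "nMBE": 100.0 * mbe / mean_obs,
        "RMSE": rmse, "nRMSE": 100.0 * rmse / mean_obs,
        "R2": r2,
    }
```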
Figure 2 presents the frequency distributions of the GHI and DHI differences (model − observations). Both histograms are nearly symmetric, with the values distributed around a mean close to zero, while the strongly peaked distributions reflect the small systematic errors of the modeling process. However, the significant number of values at both tails of the distributions explains the high computed dispersion errors.
It is worth mentioning here that the modeling process was designed using information solely from the ASIs. Including other explanatory variables representative of the cloud field, such as the cloud optical thickness and cloud fraction, could substantially improve the results.

Conclusions
Accurate knowledge of surface solar irradiance under all-sky conditions is crucial for solar-related applications. This study focused on modeling the global and diffuse horizontal irradiances using a deep learning model (i.e., a Convolutional Neural Network) with input information from all-sky images. The validation against real observations showed high correlation values, minor systematic errors, and dispersion errors of 15% and 30% for GHI and DHI, respectively. In a future step, including other explanatory variables representative of the cloud field (such as the cloud optical thickness and cloud fraction) in the model design could substantially improve the results.

Figure 1. Scatter density plots between the observed and predicted (a) GHI and (b) DHI. Warm colors indicate higher data concentration.
