Solar Irradiance Forecasting Based on Deep Learning Methodologies and Multi-Site Data

The ever-growing interest in and requirement for green energy have recently led to an increased focus on research related to forecasting solar irradiance. This study aims to develop forecast models based on deep learning (DL) methodologies and multiple-site data to predict the daily solar irradiance in two locations of India based on the daily solar radiation data obtained from NASA’s POWER project repository over 36 years (1983–2019). The forecast models extract and learn the symmetry latent in the data patterns and relationships and use it to predict future solar data. The goodness of fit and model performance are compared through rolling window evaluation using the mean squared error, root-mean-square error and coefficient of determination ($R^2$). The contributions of this study can be summarized as follows: (i) time series models based on deep learning methodologies were implemented to forecast the daily solar irradiance of two locations in India in consideration of the historical data collected by NASA; (ii) the models were developed on the basis of single-location univariate data as well as multiple-location data; (iii) the accuracy, performance and reliability of the model were investigated on the basis of standard performance evaluation metrics and rolling window evaluation; (iv) the feature importance of the nearby locations with respect to forecasting target location solar irradiance was analyzed and compared based on the solar irradiance data obtained from NASA over 36 years. The results indicate that the bidirectional long short-term memory (LSTM) and attention-based LSTM models can be used for forecasting daily solar irradiance data. According to the findings, the multiple-site data with solar irradiance historical data improve upon the forecast performance of single-location univariate solar data.


Motivation of the Study
Climate change in recent times and the high demand for electricity have led to the requirement of power generation from green and renewable sources, solar energy being one of them. Solar energy, which is an abundant sustainable energy resource, causes the least harm to the environment, turning the Sun into a major source of energy [1]. This solar power can be harnessed either through concentrated power plants or photovoltaic (PV) power plants. Here, we deal with PV power plants; their performance is mainly related to the electrical parameters of their components (PV panels, inverters), the characteristics of the installation (orientation, tilt angle) and the meteorological conditions [2]. The meteorological factor affecting the power produced by a PV field is mainly the absorbed solar irradiance. There is, in fact, a linear correlation between the PV modules' maximum power and the solar irradiance [3]. Whether the solar irradiance is high or low depends on the geographical location and time, along with the orientation of the panel relative to both the Sun and the sky [4]. As such, solar power tends to have a chaotic and intermittent behavior. In this study, the aim is to forecast irradiance optimally and in a generalized manner, as we face a problem similar to that of solar power forecasts [5]. The solar irradiance forecasting is performed on historical data from two locations in India for protection of the environment as well as energy security. The main aim is to increase the contribution of renewable or green energy to the power generated.

Problem Statement
The PV system is required for the conversion of solar energy to electricity. However, being environmentally friendly cannot by itself guarantee the acceptance of PV as an alternative to conventional energy sources. The techno-economical study for viability of any PV system requires accurately estimating the energy yield with an appropriate mathematical model. Different models were proposed for the prediction of PV module performance. However, these models mostly have a complicated structure requiring detailed knowledge of parameters which are normally unavailable in the manufacturer's data sheet. Therefore, such models are not suitable for output power calculation. It is the ambient temperature and solar irradiance meteorological data that govern the output of a PV system. Therefore, reliable temperature and radiation data should be readily available for the design of a techno-economically viable photovoltaic system. Because of the difficulties in installation, calibration and maintenance and the high cost of measuring these data, they are either not available or only partially available at the installation site. Hence, the demand exists for the development of alternative ways to predict them [6,7]. Solar irradiance is also characteristically variable; because of this, competent forecasting strategies are required for enabling greater penetration of solar power. The influence of location, weather and other meteorological factors also makes forecasting solar irradiance a challenging task. Therefore, for successful integration of solar energy with traditional generation supplies, the ability to accurately forecast solar irradiance is essential.

Related Work
There are basically three types of forecasting techniques: numerical weather prediction, image-based prediction and statistical and machine learning (ML) methods, covering short-term, medium-term and long-term prediction horizons. Solar irradiance data are time series data, i.e., observations recorded sequentially over a period of time, and traditionally, linear forecasting methods were widely used because they were well understood, easy to compute and provided stable forecasts. Autoregression (AR) [8], the moving-average model (MA), autoregressive with exogenous inputs (ARX) [9], autoregressive moving-average (ARMA) [10], autoregressive moving-average with exogenous inputs (ARMAX) [11], autoregressive integrated moving average (ARIMA) [12], seasonal autoregressive integrated moving average (SARIMA) [13], autoregressive integrated moving-average with exogenous inputs (ARIMAX) [14], seasonal autoregressive integrated moving-average with exogenous inputs (SARIMAX) [15] and generalized autoregressive score (GAS) [16,17] are the traditional forecast models. Belmahdi et al. [18] proposed the ARMA and ARIMA models for forecasting the global solar radiation parameter. The models showed improvement in terms of forecast error; however, only the solar radiation parameter was considered, and the geographical or meteorological parameters were not employed. The traditional models are mostly linear over the previous inputs or states; hence, they are not adapted to many real-world applications. One major limitation is the pre-assumed linear form of the data, which cannot capture complex nonlinear patterns. The challenges also include lower forecast accuracy and less scalability for big data. Yagli et al. [19] performed a study using satellite-derived irradiance data from multiple locations on 68 machine learning models.
The study found multilayer perceptron (MLP) models to be among the best performers and advised a daily or otherwise short evaluation period for assessing model performance. Following this, the neural network models used here were focused on day-ahead forecasting. Neural network (NN) models work with nonlinear transforming layers and do not require the data to be stationary. Neural networks are also strongly capable of determining the complex structures in data, working as an efficient tool for reconstruction of a noisy system driven by data, which is why they are suitable for complex and variable time series forecasting. They are suited for modeling problems which require capturing dependencies and are capable of preserving knowledge as they progress through the subsequent time steps in the data. The authors of [20] proposed an approach for prediction of solar irradiance using deep recurrent neural networks with the aim of improving model complexity and enabling the extraction of high-level features. The proposed method showed better performance than the conventional feedforward neural networks and support vector machines. The recurrent neural network (RNN) [21] architecture is a special type of neural network accounting for data node dependencies by preserving sequential information in an inner state, which allows the persistence of knowledge accrued from successive time steps. However, the RNN is prone to vanishing and exploding gradients. This led to the development of RNN variants such as long short-term memory (LSTM) networks [22], bidirectional LSTM and gated recurrent units (GRU), which extend the RNN architecture by replacing the conventional perceptron architecture with memory cells and gating mechanisms that regulate information flow across the network. These variants are widely used for the task of solar irradiance forecasting.
The authors of [23] stated that LSTM is a powerful approach for time series forecasting; they used it for day-ahead prediction of solar irradiance. The study proved the LSTM model to be robust; it outperformed other forecast mechanisms such as gradient boosting regression, feedforward neural networks and the persistence model. The authors of [24] proposed a mechanism for hourly day-ahead prediction of solar irradiance using the weather forecasting data. The proposed model consisting of the LSTM variant was compared to the persistence algorithm, linear least square regression and multilayered feedforward neural networks using a backpropagation algorithm (BPNN) for solar irradiance prediction which resulted in LSTM performing the best among all of the methods. The authors of [25] developed a least absolute shrinkage and selection operator (LASSO) and LSTM integrated temporal model for solar intensity forecasting which could predict short-term solar intensity with high precision. Furthermore, recurrent neural networks can be divided into two categories based on the type of mechanism they follow, one being the traditional memory-based models and the other being the attention-based ones. Some of the memory-based models are LSTM, GRU, bidirectional RNNs and so on, while some attention-based models are the attention LSTM, self-attention generative adversarial networks and multi-headed LSTM. The memory-based RNNs are the most widely used models for the task of solar irradiance forecasting in literature. Here, we also intend to introduce the attention mechanism for the task of solar irradiance forecasting. For the task of predicting an element, the attention vector is estimated on the strength of its correlation with other elements and then the sum of these values, which are weighted by the attention vector, is taken. 
The attention mechanism which was originally introduced and used specifically for machine translation has been recently used for time series forecasting in solar energy tasks. The authors of [26] proposed a temporal attention mechanism for forecasting solar power production from PV plants. The authors of [27] improved upon the attention-based architecture of transformers for forecasting solar power production. In previous works, the prediction task was performed on data from a single target location. The data from the surrounding locations were not exploited for predicting future values of target location. Here, along with the target location's data, the regional data surrounding the target location were also utilized for building the model. This was done in order to exploit the available data of multiple locations and their contribution in forecasting the future value of a particular target location. A thorough study focusing on various memory-based and attention-based deep recurrent neural network mechanisms for solar irradiance forecasting has not been carried out yet. As such, the DNN-based time series models were built on the basis of the multiple-site concept to forecast the daily solar irradiance of two locations in India on data collected from NASA over a period of 36 years.

Contributions of the Study
The variable nature of solar energy poses challenges in its integration to the power grid. Accurate forecasting is required for a techno-economically viable solar energy system. The photovoltaic power data are often proprietary and not publicly available, which creates the need to utilize the satellite-derived information of solar irradiance, since the relationship between solar power and solar irradiance is quasilinear [19]. Forecasting at different time horizons has different applications for solar energy systems such as monitoring, maintenance of stability and regulation, management of scheduling and unit commitment. Hence, solar irradiance forecasting is crucial for the domain of solar energy's advancement in economic feasibility and efficient market penetration; thus, it is essential in order to pave the way for solar energy to be a major type of green energy. The study aims to develop forecast models based on deep learning (DL) methodologies and data from multiple sites to predict the daily solar irradiance. The deep neural network mechanism of machine learning was chosen since the machine-learning-based model estimates particular types of data with high accuracy in comparison to the traditional statistical mechanisms. The machine learning models are capable of extracting and learning the inherent symmetry in patterns and relationships in data. Along with the memory-based variants of RNN, the study also introduces the attention-based mechanism for forecasting solar irradiance. The machine learning models are data-driven, and a large data set is required to understand the behavior of the system, which is often complex. Hence, the past 36 years (1983-2019) of data were provided to the model. For further validation of the proposed mechanism, two locations in India along with the adjoining regional sites were considered for testing the forecast accuracy. 
The goodness of fit and model performance were compared through rolling window evaluation using the mean squared error, root-mean-square error and coefficient of determination ($R^2$). To the best of our knowledge, no comprehensive investigation of solar irradiance forecast models utilizing the RNN variants and multiple-site data has been performed yet. The contributions of this study can be summarized as follows:

1. Time series models based on deep learning methodologies were implemented to forecast the daily solar irradiance of two locations in India through consideration of the historical data collected.
2. The models were developed on the basis of single-location univariate solar irradiance data as well as data from multiple locations.
3. The accuracy, performance and reliability of the model were investigated on the basis of standard performance evaluation metrics and rolling window evaluation.
4. The feature importance of the nearby locations with respect to forecasting target location solar irradiance was analyzed and compared on the basis of the solar irradiance data obtained from NASA over 36 years.

The paper is organized as follows: Section 2 presents the materials and methods considered in this work, including the data set used, the forecast framework, the methodology developed and the metrics used for performance evaluation. The performance of the forecasting models and a discussion are presented in Section 3. The conclusion is given in Section 4.

Materials and Methods
In the sections that follow, the methodology of the proposed deep learning framework for forecasting daily solar irradiance is discussed. The outline of the proposed framework is shown in Figure 1. The first step consisted of data collection of the point target location and the multiple-site region surrounding the target location. Then, the proposed model selected the relevant location data from the multiple-site high-dimensional data. The optimal model of forecasting was then obtained utilizing deep learning models. The proposed mechanism was validated on meteorological data from two target locations for different time horizons of forecasting. The efficiency was compared with the single-location forecast model and various deep learning models in terms of performance metrics.

Data Collection
Forecast models for solar irradiance were built on historical time series data. These data can be obtained through ground-based stations or satellite-based data sets. Due to the limited availability of ground-based stations, satellite-based data from NASA's POWER database, which is open access with long-term coverage, were used here. The daily solar irradiance data, consisting of the solar radiation incident on a horizontal surface in kWh/m² per day, were collected from the SSE-Renewable Energy Community of the POWER Data Access Viewer for a period of 36 years from 1983 to 2019.
The forecasting problem that we are trying to address here is the forecast of data in a particular target location utilizing not only the target location data, but also the data of the region surrounding the target location. This would address the data dependency on unrelated features, since it is only the solar irradiance data that is considered for all of the sites. Instead of only relying on lags of the target location's data, which might contain discrepancies, this framework considered multiple-site data of the target solar irradiance feature. The relevance of the feature is discussed in the next section. Here, we introduce a multi-site mechanism converting the problem into multivariate forecasting utilizing the concerned solar irradiance data from multiple sites. Figures 2 and 3 below show the point location and the region selected for the solar irradiance data from the POWER Regional Data Access widget, which provides access to near real-time data. Point P represents the target location, while the enclosed region represents the multiple sites surrounding the target location. These regional data were further analyzed and processed in the next step for the task of forecasting. For a single point (that is, the target location's data), a near real-time 1/2 × 1/2 degree data set was accessed by supplying a numeric vector of length two giving the decimal-degree longitude and latitude, in that order, for the data to download. For the regional coverage, a bounding box around the target point location was specified, with a maximum bounding box of 4.5 × 4.5 degrees of 1/2 × 1/2 degree data and at most 100 data points in total. A numeric vector of length four, giving the latitude and longitude coordinates of the lower left and upper right corners, was provided to define the enclosed area. The coordinates of the data set utilized for the case study are given in Table 1.
Table 2 lists the descriptive statistics of the target solar irradiance data.

Data Selection
The data of 36 years were collected for the target location as well as the enclosed region. As can be observed from Table 1, the data collected for forecasting Locations 1 and 2 consisted of 12 and 15 sites, respectively. However, the sites surrounding the target location may not all have been correlated with it and helpful in forecasting solar irradiance data. The dimensions of the features needed to be reduced so that only the relevant features were used for forecasting. Here, a feature is the solar irradiance data of one of the locations depicted in Table 1. Therefore, the forecast framework further consisted of analyzing the correlation and feature importance of the multiple-site data for forecasting the target location data. This task was accomplished by utilizing the Pearson correlation, Spearman correlation and XGBoost ranking, as discussed below.

Pearson Correlation
Pearson correlation measures the linear relationship between two variables. A value of −1 implies the variables are perfectly negatively correlated, 0 denotes that they are not linearly correlated and 1 means they are perfectly positively correlated. It generally measures the global synchrony.

Spearman Correlation
Spearman correlation is a correlation test which is nonparametric in nature. It does not carry assumptions regarding the distribution of the data and is the appropriate correlation analysis when the variables are measured on a scale that is at least ordinal. Equation (2) denotes the formula for calculation of the Spearman correlation, with $n$ being the number of observations and $d_i$ being the difference between the ranks of the corresponding variables.
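The two correlation measures can be illustrated with a minimal NumPy sketch. It assumes the textbook Pearson formula and the standard no-ties Spearman form $\rho = 1 - 6\sum_i d_i^2 / (n(n^2 - 1))$, which matches the symbols $n$ and $d_i$ described above; the function names and the toy series are hypothetical and are not taken from the study.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum()))

def ranks(v):
    """Ranks 1..n of the values in v (assumes there are no ties)."""
    v = np.asarray(v, dtype=float)
    order = np.argsort(v)
    r = np.empty(len(v), dtype=float)
    r[order] = np.arange(1, len(v) + 1)
    return r

def spearman(x, y):
    """Spearman correlation from rank differences d_i (no-ties formula)."""
    d = ranks(x) - ranks(y)
    n = len(d)
    return float(1.0 - 6.0 * (d ** 2).sum() / (n * (n ** 2 - 1)))

# Toy series standing in for a target site and one neighboring site.
target = np.array([4.1, 5.0, 5.6, 6.2, 7.9])
neighbor = np.array([3.9, 5.2, 5.5, 6.8, 7.1])
print(pearson(target, neighbor), spearman(target, neighbor))
```

Because both toy series increase monotonically, their ranks agree and the Spearman value is exactly 1 even though the Pearson value is below 1.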

XGBoost
Along with the correlations between target and neighboring locations, we also computed the feature importance of the solar irradiance values. eXtreme Gradient Boosting (XGBoost) utilizes information gain for estimating the importance of features. After the boosted trees are constructed, the importance scores for each attribute can be retrieved in a straightforward manner. The importance is calculated for every feature, which allows the features to be ranked and compared with each other. For a single decision tree, the importance of a feature is the amount by which each of its split points improves the performance measure, weighted by the number of observations the node is responsible for. The performance is measured using an error function, and the feature importance is then averaged across all of the decision trees within the model.
These scores would indicate the relation of the multiple sites and the target location, such that only the relevant sites are chosen which would improve the forecast accuracy. Figure 4 depicts the process of selection of m locations from the collected n sites data. The solar irradiance of these m locations and the target location were used for the forecast methodology, utilizing deep neural networks as stated in the next section.
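The selection of m locations out of n collected sites can be sketched as a ranking step. As a simplifying assumption, the sketch below scores each site by its absolute Pearson correlation with the target only; it is a stand-in for, not a reproduction of, the combined Pearson, Spearman and XGBoost ranking used in the study. The function name `select_sites` and the synthetic series are hypothetical.

```python
import numpy as np

def select_sites(site_data, target, m):
    """Return the indices of the m best-scoring sites and all scores.

    site_data: array of shape (n_days, n_sites)
    target:    array of shape (n_days,)
    Score (assumption): absolute Pearson correlation with the target.
    """
    site_data = np.asarray(site_data, dtype=float)
    target = np.asarray(target, dtype=float)
    scores = np.array([
        abs(np.corrcoef(site_data[:, j], target)[0, 1])
        for j in range(site_data.shape[1])
    ])
    # Sort sites by descending score and keep the first m.
    top = np.argsort(-scores)[:m]
    return top, scores

# Synthetic example: three candidate sites observed over 200 days.
t = np.arange(200)
target = np.sin(2 * np.pi * t / 50)
other = np.cos(2 * np.pi * t / 50)
sites = np.column_stack([
    2.0 * target + 1.0,   # site 0: perfectly linearly related
    other,                # site 1: unrelated to the target
    target + other,       # site 2: partly related
])
top, scores = select_sites(sites, target, m=2)
print(top, scores.round(3))
```

Any other importance score, for example one obtained from a trained gradient-boosted model, could replace the correlation inside `select_sites` without changing the ranking logic.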

Forecast Methodology
This section introduces the deep learning methodology of recurrent neural networks and their variants. The vanilla LSTM, GRU, bidirectional LSTM, CNN LSTM and the attention mechanisms are presented below. The networks took the past solar irradiance data of the target location and the selected multi-site locations as input features. The final output $\hat{y}_{t-1+\Delta}$ was the forecast of the future solar data of the target location. Given the daily solar irradiance time series $X = \{x_1, x_2, \ldots, x_{t-1}\}$, with $x_i$ representing the observed value at time $i$, the problem was forecasting $x_{t-1+\Delta}$, with $\Delta$ as the horizon w.r.t. different tasks. $\hat{y}_{t-1+\Delta}$ is the prediction, with $y_{t-1+\Delta} = x_{t-1+\Delta}$ being the ground-truth value. For every task, $\{x_{t-w}, x_{t-w+1}, \ldots, x_{t-1}\}$ is used to predict $x_{t-1+\Delta}$, where $w$ is the window size; the input length is fixed on the assumption that no useful information exists before the window. Equation (3) denotes the problem, with $y$ representing the $\Delta$-days-ahead forecast of solar irradiance data by the deep neural network model $f$ on past historical data. Since data from multiple locations were used, the input data $x$ consisted of time-lagged values of the target location data as well as the regional multi-site data.
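The windowed problem formulation above can be sketched as a small data preparation routine. The function name `make_windows` and the array layout are assumptions made for illustration: each input holds the w days before day t for every series, and the label is the target value Δ − 1 days after the window ends.

```python
import numpy as np

def make_windows(features, target, w, delta):
    """Build supervised pairs for delta-days-ahead forecasting.

    features: array (n_days, n_series), e.g. the target plus selected sites
    target:   array (n_days,), the target location's irradiance
    Each input covers days t-w .. t-1; its label is day t-1+delta.
    """
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    X, y = [], []
    # t is the 0-based index of the first day after the window.
    for t in range(w, len(target) - delta + 1):
        X.append(features[t - w:t])
        y.append(target[t - 1 + delta])
    return np.array(X), np.array(y)

# Univariate toy series 0, 1, ..., 9 used as both feature and target.
series = np.arange(10.0)
X1, y1 = make_windows(series[:, None], series, w=3, delta=1)
X4, y4 = make_windows(series[:, None], series, w=3, delta=4)
print(X1.shape, y1[0], X4.shape, y4[0])
```

For the multi-site case, `features` would simply gain one column per selected location while `target` stays the single target series.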
The solar irradiance forecast modeling in the univariate scenario for a single location was performed by taking the historical data at time $t-1+\Delta$, denoted by $Irr_{t-1+\Delta}$ in Figure 5, as the target variable and the window-lagged data, $Irr_{t-w}, Irr_{t-w+1}, \ldots, Irr_{t-1}$, as input variables. $w$ represents the window length for the lagged time series, which was decided upon using the autocorrelation and partial autocorrelation characteristics of the data. $L_i$ represents the $i$th hidden layer, with the values of $i$ set during model tuning. Overall, the future values of solar irradiance were predicted using the past and present values according to the set window size. In the proposed forecast framework, the difference was in the features utilized for future value prediction. Instead of only depending upon the past lagged values of one location, the values of multiple locations were utilized. This converted the univariate form of forecasting into a multivariate form, with solar irradiance data from the target location as well as $m$ locations as features. The final aim was forecasting solar irradiance values for the target location. Figure 6 represents this framework, with $Irr_j$ representing the $j$th location's data, where $j$ ranges from 1 to $m$, and $Irr_{target}$ denoting the solar irradiance for the target location. The deep neural network assigns weights to the past input data in order to predict the future values. The recurrent neural network variant of DNN is used for the task of time series forecasting since it is suited for sequential data and remembers the temporal dependencies in data. Equation (4) represents a vanilla RNN, which takes into account the present input along with the output of the previous state. The proposed framework utilizes the enhanced cell mechanism of this vanilla RNN, which is capable of understanding more complex and long-term dependencies.

LSTM
LSTM belongs to the family of recurrent neural networks, improving on the efficiency of traditional sequence learning mechanisms. The problem of vanishing and exploding gradients persists in the RNN, which led to the development of LSTM in order to overcome these problems of RNNs.
LSTM, as shown in Figure 7 [28], introduces additional computation components to the RNN: the input gate, the forget gate and the output gate. The equations for the forward pass are stated below. The current input and the previous state are processed by $A_t$, after which the input gate $I_t$ decides upon the parts of $A_t$ to be added to the long-term state $C_t$. $F_t$ is the forget gate, with the responsibility of deciding which parts of $C_{t-1}$ are to be erased, discarding the unnecessary parts. The output gate $O_t$ finds the parts of $C_t$ to be read and shown as output. As such, there exists a short-term state $H_t$ that is shared between the cells and a long-term state $C_t$ in which the memories are dropped and added by the respective gates. In the weight update equation, $*$ denotes any one of {cur, inp, for, out} and $\langle *, * \rangle$ denotes the product.
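The forward pass described above can be sketched for a single time step in NumPy. This assumes the standard LSTM cell equations (sigmoid gates, tanh candidate), which may differ in detail from the formulation in the source; the dictionary keys follow the names cur, inp, for and out from the text, while the function name and the weight shapes are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM forward step (standard formulation, assumed here).

    W[k] has shape (n_hidden, n_input + n_hidden) for k in cur/inp/for/out.
    """
    z = np.concatenate([x_t, h_prev])         # current input and previous state
    a_t = np.tanh(W["cur"] @ z + b["cur"])    # A_t: candidate content
    i_t = sigmoid(W["inp"] @ z + b["inp"])    # I_t: what to add to C_t
    f_t = sigmoid(W["for"] @ z + b["for"])    # F_t: what to erase from C_{t-1}
    o_t = sigmoid(W["out"] @ z + b["out"])    # O_t: what to read from C_t
    c_t = f_t * c_prev + i_t * a_t            # long-term state
    h_t = o_t * np.tanh(c_t)                  # short-term state
    return h_t, c_t

rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
keys = ("cur", "inp", "for", "out")
W = {k: rng.normal(scale=0.1, size=(n_hid, n_in + n_hid)) for k in keys}
b = {k: np.zeros(n_hid) for k in keys}

# Run the cell over a toy sequence of five time steps.
h, c = np.zeros(n_hid), np.zeros(n_hid)
for x_t in rng.normal(size=(5, n_in)):
    h, c = lstm_step(x_t, h, c, W, b)
print(h.shape, c.shape)
```

The weight update itself (training by backpropagation through time) is not shown; in practice it is handled by a deep learning framework.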

Bidirectional LSTM
LSTM processes the inputs in strict temporal order. This indicates that the current input has context of previous inputs, but not the future. The bidirectional LSTM [29] model was introduced to address this shortcoming. It duplicates the LSTM processing chain so that the inputs are processed in both forward and reverse time order. This allows the network to look into the future context as well.

GRU
The gated recurrent unit [30] can be viewed as the simplified version of LSTM in which the cell states are not used explicitly. The main simplification made is that both the state vectors are merged into a single vector. Figure 8 [31] represents a GRU cell, and the equations followed during forward pass are shown below.
There exists a single gate controller which controls the input gate and the forget gate. On a gate controller giving an output of 1, the input gate is opened while the forget gate is closed, and vice versa when the output is 0. This implies that on requirement of a memory to be stored, the location to be stored is erased first. It is, in fact, a variant of LSTM which is used frequently. No output gate exists and the full state vector is output every time step. However, there is a new gate controller which controls the part of the previous state which will be shown to the main layer.
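A single GRU step can be sketched in the same way. The sketch assumes the common formulation in which a controller output of 1 lets the new candidate in and erases the old state, matching the description above; published GRU variants differ in this sign convention. The names `gru_step`, `z`, `r` and `g` are illustrative choices.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x_t, h_prev, W, b):
    """One GRU forward step; W[k] has shape (n_hidden, n_input + n_hidden)."""
    xh = np.concatenate([x_t, h_prev])
    z_t = sigmoid(W["z"] @ xh + b["z"])   # single controller for input/forget
    r_t = sigmoid(W["r"] @ xh + b["r"])   # how much of h_prev the main layer sees
    cand = np.tanh(W["g"] @ np.concatenate([x_t, r_t * h_prev]) + b["g"])
    # z_t = 1: store the candidate and erase the old state; z_t = 0: keep it.
    return (1.0 - z_t) * h_prev + z_t * cand

rng = np.random.default_rng(1)
n_in, n_hid = 3, 4
W = {k: rng.normal(scale=0.1, size=(n_hid, n_in + n_hid)) for k in ("z", "r", "g")}
b = {k: np.zeros(n_hid) for k in W}

h = np.zeros(n_hid)
for x_t in rng.normal(size=(5, n_in)):
    h = gru_step(x_t, h, W, b)   # the full state vector is the output
print(h)
```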

CNN LSTM
CNN [32] architecture consists of three types of layers: the convolutional layer, the pooling layer and the fully connected layer. Convolutional layers take in the feature maps as inputs from the previous layer and perform convolution operations between filters and the inputs.
where $w_i$ are the convolutional kernel parameters, $v_i$ the output of the previous layer and $(x, y)$ the spatial coordinate. A complete feature map is obtained as follows, with $b$ as a scalar bias and $g$ as the nonlinear activation function.

$$z(x, y) = g(\mathrm{conv}(x, y) + b)$$

The hybrid CNN LSTM method consists of a series of connections between the convolutional and LSTM layers. The convolution operation reduces the number of parameters, and a pooling layer combines the output of a cluster of neurons into a single neuron. The pooling layer also reduces the parameters and computation cost of the network. Max pooling is used here, which selects the maximum value from each neuron cluster. The LSTM layers are placed after the CNN layers. The past and future contexts are kept in view along with consolidation of memory units and cell states for the temporal dependencies. The vanishing and exploding gradients are also addressed by the LSTM layers. Dropout regularization is also used to counter overfitting.
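The convolution and max pooling steps can be illustrated on a one-dimensional sequence. The sketch makes two simplifying assumptions that are not taken from the source: the convolution runs along the time axis only, and the activation $g$ is a ReLU. The function names and the toy values are hypothetical.

```python
import numpy as np

def conv1d(v, w, b=0.0):
    """Valid 1-D convolution (cross-correlation form), then g = ReLU."""
    v = np.asarray(v, dtype=float)
    w = np.asarray(w, dtype=float)
    k = len(w)
    conv = np.array([np.dot(w, v[i:i + k]) for i in range(len(v) - k + 1)])
    return np.maximum(conv + b, 0.0)      # z = g(conv + b)

def max_pool1d(z, size=2):
    """Keep the maximum of each non-overlapping cluster of `size` values."""
    n = len(z) // size
    return z[:n * size].reshape(n, size).max(axis=1)

v = np.array([1.0, 3.0, 2.0, 5.0, 4.0, 6.0])   # toy input sequence
w = np.array([0.5, 0.5])                        # averaging kernel
z = conv1d(v, w)
p = max_pool1d(z)
print(z, p)
```

In a CNN LSTM, the pooled feature sequence would then be passed to the LSTM layers as their input.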

Attention LSTM
The attention mechanism belongs to the sequence-to-sequence model which was built mainly for neural machine translation. It consists of an encoder and decoder with encoder encoding the input to a fixed length vector and decoder translating it [33]. The attention mechanism addresses the long-term dependencies in which past values from far back might be affecting the present day forecast. The identification of relevant features and dynamic interdependencies can be done through attention. The attention mechanisms mainly differ in the architecture of the encoder-decoder adopted and the score function. In the sequence-to-sequence model, the decoder receives the last encoder hidden state from the encoder, a vector representation, much like an input sequence's numerical summary. Thus, for a long input, the decoder uses just this one vector representation to output the prediction, which leads to forgetting. As such, attention was introduced, which acts as an interface in between the encoder and decoder providing information from every encoder hidden state to the decoder. As shown in Figure 9 [34], this enables the model to selectively focus on useful parts of the input sequence based on the scoring function, and thus learn the alignment between them. Here, score represents a typical scoring function which represents the relevance between the input vectors. The whole process is done for computation of the context vector which is then forwarded to the decoder layer [26].
where $x$ is the given input, $h_{t-1}$ is the hidden state and $s_{t-1}$ the cell state. $W_a$, $W_x$ and $b_a$ are the attention weights and bias. As can be observed from Figure 9, after the input layer there is an encoder layer which processes the input and then forwards it to the attention layer. $\sigma$ represents the softmax function applied to the score. The attention layer processes the input according to the equations stated above and then feeds its output to the decoder, to be sent to the output layer. These encoder and decoder layers are simply the DNN hidden nodes and layers, which perform typical sequential learning over the complete process. The additional benefit is the computation of relevant information after extracting it through the scoring function. This aids in managing long-term dependencies through just the introduction of an attention layer.
In the attention LSTM developed for solar irradiance forecasting, the score function adopted is the content-based attention. The attention vectors in content-based attention [35] or cosine scoring are created on the basis of similarity between the key and memory rows. It computes the cosine similarity which is then normalized by the softmax function.
The encoder layer consists of the bidirectional LSTM and the vanilla LSTM. The hidden outputs from this layer are then fed to the attention layer, which computes the score function and then the context vector. This is then forwarded to the decoding fully connected (dense) layer, which gives the output, i.e., the forecast values of solar irradiance.
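The content-based (cosine) scoring step and the context vector can be sketched as follows. The sketch assumes that the encoder hidden states are already available as rows of a matrix and that a single query vector is compared against them; the learned weights $W_a$, $W_x$ and $b_a$ are left out, so this shows only the scoring, softmax and weighted-sum steps. The function name is hypothetical.

```python
import numpy as np

def cosine_attention(hidden_states, query):
    """Cosine scores, softmax normalization and the resulting context vector.

    hidden_states: array (T, d) of encoder outputs, one row per time step
    query:         array (d,) compared against every row
    """
    H = np.asarray(hidden_states, dtype=float)
    q = np.asarray(query, dtype=float)
    eps = 1e-12                                   # guards against zero norms
    scores = H @ q / (np.linalg.norm(H, axis=1) * np.linalg.norm(q) + eps)
    weights = np.exp(scores - scores.max())       # numerically stable softmax
    weights /= weights.sum()
    context = weights @ H                         # weighted sum of hidden states
    return context, weights

H = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 0.0]])
context, weights = cosine_attention(H, query=np.array([1.0, 0.0]))
print(weights, context)
```

The time steps whose hidden state points in the same direction as the query receive the larger weights, so the context vector is dominated by them.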

Performance Evaluation Metrics
In order to verify the performance of forecasting models, the goodness of fit needs to be measured. Common metrics for this purpose include MSE, MAE, RMSE, SSE and the R² score. Here, we use the metrics listed in Table 3. E denotes the expected (actual) value of the target output and F denotes the output of the forecast model given input X and weight w. MSE and RMSE quantify the forecast error, and lower values denote better performance. R² is the coefficient of determination, indicating closeness of fit relative to a baseline model. When the R² score tends to 1, the relationship between the predictors and the response variable is considered to be strong, whereas an R² score close to 0 indicates the opposite.
Table 3. Forecast performance evaluation metrics.
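For reference, the three metrics can be computed as in the following sketch, which uses the standard textbook definitions and takes the mean of the actual values as the R² baseline. The function names and the sample values are illustrative assumptions and are not taken from the study.

```python
import math

def mse(actual, forecast):
    # Mean squared error between actual values E and forecasts F.
    return sum((e - f) ** 2 for e, f in zip(actual, forecast)) / len(actual)

def rmse(actual, forecast):
    # Root-mean-square error: square root of the MSE, in the units of the data.
    return math.sqrt(mse(actual, forecast))

def r2_score(actual, forecast):
    # Coefficient of determination: 1 - (residual sum of squares / total sum of squares).
    mean_e = sum(actual) / len(actual)
    ss_res = sum((e - f) ** 2 for e, f in zip(actual, forecast))
    ss_tot = sum((e - mean_e) ** 2 for e in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical sample values for illustration only.
actual = [1.0, 2.0, 3.0, 4.0]
forecast = [1.5, 2.0, 2.5, 4.0]
sample_mse = mse(actual, forecast)       # 0.125
sample_rmse = rmse(actual, forecast)
sample_r2 = r2_score(actual, forecast)   # 0.9
```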

Results and Discussion
The results of the simulation experiments performed on historical data are presented in this section. The analysis was performed based on test results, which were further analyzed for their diversity, robustness, importance of the features used and the significance of multiple horizons. These results are discussed and represented in the subsections that follow.

Test Results
The results of the proposed multi-site deep learning forecast methodology were compared with those of the traditional deep learning forecast mechanism for the two locations in India. Tables 4-9 show the performance of the forecasting models in terms of MSE, RMSE and coefficient of determination. The best performing models are shown in bold. For suitability of representation, the MSE values were multiplied by 10³ and the RMSE values were multiplied by 10².
Tables 4-6 indicate the performance metrics of each developed model for Location 1 with different horizons. The developed forecast methodology had the lowest MSE and RMSE and the highest R² score in all of the cases. The bidirectional LSTM performed the best for 1-day-ahead forecasting, whereas the attention LSTM proved superior in 4-days-ahead and 10-days-ahead forecasting of solar irradiance. The single-location forecast models also performed well; however, the developed multi-location model performed better, indicating its ability to accurately forecast solar irradiance data. Similar results were observed for Location 2, as represented in Tables 7-9. The proposed methodology performed the best among the models compared. The bidirectional LSTM performed the best for forecasting 1-day-ahead and 4-days-ahead solar irradiance, while the attention LSTM outperformed the others in forecasting 10-days-ahead data. It was also observed that Location 2 showed lower errors than Location 1.
The performance of the deep learning forecast methodology was improved by the addition of the multiple-site solar irradiance features. The tables indicate that the proposed methodology outperformed the other models on all data sets, metrics and horizons. Furthermore, the bidirectional LSTM and attention-based LSTM models performed the best among the DL models. The bidirectional LSTM models showed their capability of exploiting data from temporal contexts, while the attention LSTM utilized complex and nonlinear interdependencies between time steps and time series for predicting future values of solar irradiance data.
Both of the models showed consistent performance of lower MSE and RMSE values and higher coefficients of determination, with the attention-based models performing the best in longer horizons with more complex characteristics. It can be established that the attention LSTM, based on the content-based scoring function and proposed multi-site data, is indeed an enhancement to the previously developed forecast models.

Analysis of Diversity and Robustness
As stated by [36], forecasting performed from a single origin tends to be prone to corruption because of occurrences which are unique to that origin. It has also been rightly said that the performance on data outside that used in its construction remains the touchstone for its utility in all applications. Predictive machine learning includes the routine application of repeated subsampling of the data set on which some algorithm is parameterized, which in turn leads to diversity in data rather than in the algorithm [37]. This technique also assesses how well our algorithm would perform in the case of unseen or independent data sets. In the performance estimation model considered here, the model is updated only with new data. Past trained data is not considered from the origin for training again as shown in Figure 10. The forecast operation was performed not only for 1-day-ahead solar irradiance but also for multiple horizons as indicated in Figure 11. The historical daily solar irradiance data were utilized to forecast 1-day-ahead, 4-days-ahead and 10-days-ahead target data. This also proved the robustness of the proposed methodology for diverse horizons. Tables 4-9 in the previous subsection indicate the results of this performance estimation model for multiple horizons. The error metrics indicate the suitability and reliability of the proposed method for accurately forecasting solar irradiance data.
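The rolling window scheme can be sketched as follows. This is an illustrative reading of the procedure rather than the exact configuration of Figure 10: the function name, the fixed window sizes and the way the horizon offsets the test block are assumptions. At each step the training window slides forward by one test block, so earlier observations drop out of training instead of accumulating from the origin.

```python
def rolling_window_splits(n_samples, train_size, test_size, horizon=1):
    # Return (train_indices, test_indices) pairs for a sliding, fixed-size window.
    # The test block starts `horizon` steps after the last training observation.
    splits = []
    start = 0
    while start + train_size + (horizon - 1) + test_size <= n_samples:
        train = range(start, start + train_size)
        test_start = start + train_size + (horizon - 1)
        test = range(test_start, test_start + test_size)
        splits.append((train, test))
        start += test_size  # slide forward; the oldest training data are dropped
    return splits

# Hypothetical sizes: 10 daily observations, 4-day training window, 2-day test block.
splits_1_day = rolling_window_splits(10, train_size=4, test_size=2, horizon=1)
splits_4_day = rolling_window_splits(10, train_size=4, test_size=2, horizon=4)
for train, test in splits_1_day:
    print(list(train), "->", list(test))
```

A model would be refitted on each training window and scored on the corresponding test block, with the metrics then averaged over all windows.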

Feature Importance Analysis
Feature selection is an important task for DNN models, both for eliminating features that do not contribute to forecasting the target data and for reducing computational time and complexity. The proposed methodology utilized solar irradiance data from multiple locations for forecasting target solar data. As such, the importance of each variable was determined in order to select the most influential input features efficiently. The analysis selected particular regional point locations from among the multiple points surrounding the target location. Tables 10 and 11 show the feature importance on the basis of Pearson, Spearman and XGBoost scoring for Location 1 and Location 2, respectively. The feature importance was calculated for site IDs 1, 2, 4, 5 and 8 for Location 1, while for Location 2, the sites with IDs 1, 4, 11, 12 and 14 were analyzed. The sites for the two locations were different and did not overlap with each other. The values of the scores were multiplied by 10² for ease of representation. For Location 1, sites 2, 5 and 8 out of the five sites showed the greatest correlation. In the case of Location 2, sites 1, 4 and 12 had the greatest importance out of all five sites. As such, only the solar irradiance data of the most important sites were retained in the data selection step.
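The correlation-based part of this ranking can be illustrated with the short sketch below. The site names and irradiance values are hypothetical and do not come from Tables 10 and 11, the rank function assumes there are no tied values, and the XGBoost-based score is omitted. Each candidate site is scored against the target series with the Pearson and Spearman coefficients, and the sites are then ordered by absolute Pearson correlation.

```python
import math

def pearson(x, y):
    # Pearson correlation coefficient between two equally long series.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    # Rank positions starting at 1; assumes no ties (illustrative simplification).
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = float(rank)
    return result

def spearman(x, y):
    # Spearman correlation is the Pearson correlation of the ranks.
    return pearson(ranks(x), ranks(y))

# Hypothetical daily irradiance values for a target location and two nearby sites.
target = [4.1, 5.0, 5.6, 6.2, 4.8, 3.9]
sites = {
    "site_A": [4.0, 5.1, 5.5, 6.0, 4.9, 3.8],
    "site_B": [5.0, 4.2, 6.1, 4.4, 5.3, 5.9],
}
scores = {name: (pearson(v, target), spearman(v, target))
          for name, v in sites.items()}
ranked = sorted(scores, key=lambda s: abs(scores[s][0]), reverse=True)
```

Sites at the top of `ranked` would be kept as input features, mirroring the selection of the three most important sites per location.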

Comparison for Different Horizons
The results also showed the performance of the models for different horizons. As the length of the horizon increased, the performance of the models changed in a similar fashion in all of the cases, for both locations. The performance metrics showed similar trends, with better performance at shorter horizons. Figures A1 and A2 in Appendix A represent this change for both locations in terms of MSE and R² score as the forecast horizon increases. For a horizon length of 1, the models tended to give the best performance metrics. The proposed model again performed the best compared to the other models, and a similar trend was seen for all of the horizon lengths.

Conclusions
Solar irradiance forecasting has captured the attention of current research due to the requirement for and interest in renewable and green energy. Accurate forecasting of solar irradiance is required to understand the solar energy perspective of a region, considering the opportunities as well as the challenges related to forecasting. DNN models can efficiently and accurately predict daily solar irradiance data. In this work, DNN models were used to predict the daily solar irradiance of two locations in India using multiple-site solar irradiance data. A historical data set of solar irradiance covering the past 36 years was used for training and testing. To check the validity and stability of the simulation results, the goodness of fit of the model was tested using MSE, RMSE and the coefficient of determination. The results demonstrated the capability of the proposed methodology in providing accurate daily prediction of solar irradiance. The coefficient of determination (R²) was equal to 70% and 73% for the two locations, respectively. R² scores greater than 50% indicated excellent forecast performance of the model. Moreover, the feature importance was analyzed utilizing correlation and XGBoost scores. The results supported the selection of the multi-site data and its goodness of fit. In addition, a comparison of the DNN models on the basis of multiple horizons was also conducted. The results showed that forecasting tasks with shorter horizons show better accuracy, while longer horizons require more complex models.
The present study also exhibited certain limitations. The black box nature of machine learning models makes understanding the model difficult. Our future work will concentrate on exploiting hybrid models consisting of linear and nonlinear models. The limitations of single models in processing data patterns, together with the nonstationary behavior of solar irradiance and meteorological parameters in various atmospheric conditions, have led to the introduction of hybrid approaches to achieve more accurate results for modeling and forecasting [38,39], which in turn enhance model interpretability and accuracy. The study considered the data of two locations from a single country. Future research will include target data from multiple locations and different climate zones. In general, promising models with boosted forecast precision can estimate potential solar energy in particular locations and advance the sustainable planning of solar power applications.