Very Short-Term Power Forecasting of High Concentrator Photovoltaic Power Facility by Implementing Artificial Neural Network

Abstract: Concentrator photovoltaic (CPV) technology is used to obtain cheaper and more stable renewable energy. Methods that predict the energy production of a power system under specific circumstances are essential if the system is to operate as part of a larger one or be integrated with the grid. This paper describes the development of a model that predicts the energy of a High CPV (HCPV) system using an Artificial Neural Network (ANN). The system is located at the University of Rabat. The experiments show fast prediction with encouraging results for a very short-term prediction horizon, considering the small amount of data available. These conclusions are based on the process of obtaining the ANN models and a detailed discussion of the results, which have been validated using real data.


Introduction
The challenge of reducing the carbon emissions of fossil-fuel electricity and their effects on the environment, together with the associated financial challenge, has favoured the advancement of green energies such as solar photovoltaic (PV). As one of the best-known green energy resources, PV technology carries very high expectations because of its ability to curb dependence on fossil fuels. This holds in particular for concentrator PV (CPV) [1], where the solar beam radiation is concentrated onto small but highly efficient multi-junction (MJ) solar cells by means of cheap optical devices such as curved mirrors and lenses. A typical CPV system includes a solar tracker and a cooling system. The solar tracker keeps the modules aligned with the solar rays during the day. The cooling system is located at the back of the modules; in addition to supporting the solar cells, it dissipates the heat caused by the concentration [2]. Intermittent power production is one of the main weaknesses hindering the development of PV systems. Energy suppliers and users can face operation and control issues if power production is unstable and does not match demand. In the energy market, the energy forecast is essential to both the development and the integration of every energy production system, since possible increases or drops in power production can be estimated in advance. This may allow energy suppliers to regulate their services in order to increase cost savings and ensure a continuous electricity supply [3]. The output power of a PV system can be forecast either directly or indirectly [4,5]. In indirect forecasting, solar radiation is first forecast as an intermediate step, using ambient temperature, relative humidity, wind speed, wind direction and clearness index, and the PV power forecast is then derived from it [6][7][8].
In direct forecasting, empirical equations or machine learning algorithms are used to forecast the power itself, and this approach is more accurate than indirect forecasting. In the literature, PV power forecasting is classified by the forecasting horizon, by the historical data of solar irradiance and other meteorological data patterns, and by the techniques used for the forecasting. The forecast horizon is the span of time into the future for which the PV power output is to be forecast, and it should be considered before designing the forecast model [9]. The classification of power generation forecasting based on the time horizon, together with the applications of each class, can be found in Table 1. This classification is not a standard, since some authors use only three groups and define the short-term forecast as anything shorter than 7 days. As presented in Table 1, the short-term forecast is designed to support unit commitment, scheduling and dispatching of electrical power. It is recommended when designing a PV-integrated energy management system, and it enhances grid operation security. The very short-term forecast is more useful for control and adjustment of the system during operation. A complete review of PV power forecasting can be found in J. Antonanzas et al. [10]. CPV is considered a modern technology [11], and thus reliable forecasting is necessary in order to accelerate the spread of this technology in areas of high Direct Normal Irradiance (DNI) worldwide. However, CPV forecasting is challenging compared to flat-plate systems. Some of the reasons for this are listed below:

• The output power of a CPV system is mainly DNI dependent. This solar irradiance component is influenced by clouds and aerosols, which makes it more variable and more difficult to predict than the global radiation [12,13].
• The input spectral distribution is modified by the optics used for the light concentration. Therefore, the system becomes angular dependent [14,15].
• Multi-junction solar cells are mostly used in CPV systems. The spectral distribution of the concentrated sunlight influences the temperature and the current-matching ratio between the sub-cells, which in turn affects the electrical output of these devices [16][17][18].
• It is challenging to measure the cell temperature once the device is mounted in the CPV assembly. Even predicting the cell temperature is challenging, since the cell can no longer be reached because of the surrounding components of the CPV module [19,20].
• The outdoor performance of CPV systems is also affected by lens temperature, pointing errors and soiling [8,21].
Forecast models can be classified, according to the available data, as persistence-based methods, statistical models or hybrid models, as shown in Figure 1. Artificial intelligence (AI) and machine learning models can recognise the dynamics of a system with no previous knowledge of the interactions between its components, and thus they have been used in PV applications [23]. Artificial Neural Networks (ANNs) are used in a broad variety of solar power and energy systems to model those systems on the production side [24,25] and/or the demand side [26]. Typical applications involve Feed-Forward Neural Networks (FFNN) or Radial Basis Function Neural Networks (RBFNN) [27]. In order to choose the set of inputs for the ANN that optimises a cost function, the ANN can be combined with other artificial intelligence techniques, such as Particle Swarm Optimisation (PSO), Genetic Swarm Optimisation (GSO) [28], fuzzy logic [29], Genetic Algorithms (GA) [30], stepwise regression [31], Principal Component Analysis (PCA) [32], or firefly optimisation [33].
In the CPV field, the difficulty of building a good model of the system makes ANNs a suitable choice. Thus, some authors have developed ANN-based models either to estimate environmental data or to model and predict the electrical output of the CPV device. For the estimation of the DNI, Lopez et al. [34] reported a Bayesian ANN that uses the air mass and the clearness index as inputs. Chu et al. [35] overcame the challenge of DNI modelling by using an FFNN combined with a GA, with data based on time series of measured DNI and cloud coverage. For the same purpose, J. Mubiru et al. [36] developed an FFNN fed with the monthly average daily DNI, the maximum temperature, the sunshine hours, the geographical coordinates (longitude and latitude) and the location height. Renno et al. [37], in turn, developed an ANN model that predicts the hourly DNI using the clearness index K_t, the declination and hour angles and the global normal irradiance as inputs. Other researchers used FFNNs to forecast the DNI [38,39]. Some researchers tried to forecast both the DNI and the Diffuse Irradiance (DHI) using either FFNN [40][41][42] or RBFNN [43]. Moreover, deep learning has been used for very short-term forecasting. Ospina et al. [44] used Long Short-Term Memory (LSTM) neural networks to predict the power of a PV plant over an interval of 30 min, using the available weather data and PV power time series to obtain the model. Liu et al. [45] used LSTM and the Discrete Wavelet Transform (DWT) to predict very short-term (15 min) wind power changes: the DWT was applied independently of the LSTM to obtain sub-signals from the original data (wind power), and the LSTM was then used for the forecasting.
With regard to the modelling of the cell temperature in CPV technology, some researchers used different mathematical equations to solve the issue [19,46,47], but E.F. Fernández et al. [48] proposed an FFNN trained with the Levenberg-Marquardt back-propagation algorithm to calculate the cell temperature of a High CPV (HCPV) module from easily obtainable atmospheric data as inputs, namely the DNI, the air temperature (T_air), and the wind speed (W_s).
In CPV applications, earlier researchers have devoted considerable effort to developing empirical models based on outdoor measurements for the prediction of the output power of an HCPV module [49,50]. Nevertheless, these methods are difficult to implement for several reasons: (i) they require complex, accurate and sometimes very expensive devices to carry out the tests, and (ii) they require some parameters of the MJ cells used in the HCPV construction [2]. Some of those models need parameters, such as the Z parameter, that are related to intrinsic information on the MJ cell [51]. In order to avoid those complications, some authors have used ANNs for this purpose. F. Almonacid et al. [20] reported an ANN-based model that predicts the maximum power of an HCPV module in outdoor conditions. They used a few external inputs: Air Mass (AM), Precipitable Water (PW), T_air, W_s, and the DNI. The ANN was trained with the Levenberg-Marquardt back-propagation algorithm, which is only guaranteed to find a local minimum. Rivera et al. [52] developed CO2RBFN, a cooperative-competitive model for the calculation of the maximum power. This model accounts for T_air, W_s, the Average Photon Energy (APE), and the DNI. Therefore, a spectroradiometer is required to measure the APE. All the methods mentioned above consider only a single module. A detailed review of ANN applications and their uses for CPV modelling can be found in [53].
In addition to the works listed in the previous paragraphs, RBFNN models can be used to predict the maximum power of a CPV system, because they can produce such a prediction model without prior knowledge of the details of the CPV system, requiring only a basic knowledge of the variables and their effect on the system output. In this way, an ANN can be considered a black-box model [54,55]. Its use in this way is increasing, especially for complicated systems such as CPV.
In this work, a prediction model for an HCPV system has been developed as a black-box model using an RBFNN for very short-term power forecasting purposes and has then been compared to real data recorded during the operation of the system. Specifically, the HCPV system is located at the campus of the International University of Rabat, in the middle-west of Morocco. The results obtained from the RBFNN show good accuracy in capturing the behaviour of the HCPV plant. Very short-term PV power forecasting (from a few seconds up to one hour) is used for power smoothing, real-time electricity dispatch, and optimal reserves. It is also useful for controlling smart inverters, which reduce the ramp events that can damage the grid. Ramp events are highly taxed in some electricity markets and therefore reduce the profitability of the system [56][57][58].
The rest of the work is structured as follows: The next section explains the methods used in brief and presents the proposal for the available data in this work. In Section 3, the results obtained will be presented and analysed and, in the last section, the main conclusions and future works will be listed.

The HCPV Facility
The HCPV system on which this work has been carried out consists of three strings of 36 modules, connected in series. This system is located at the campus of the International University of Rabat in Morocco, and its geographical coordinates are latitude 33.982° N, longitude 6.7248° W. Figure 2 shows the HCPV plant. Each small square is a primary optic (lens), and six horizontally adjacent lenses belong to the same CPV module. The reflecting points are the centres of the polymethyl methacrylate (PMMA) Fresnel lenses. The MJ cells are located 40 cm behind those points, each cell on its own heat sink. The secondary optics are placed above the cells so that the sun's rays must pass through them before reaching the cells. The tracking system is a double-axis system, driven by two motors at the rear. Table 2 shows the technical characteristics of the modules as provided by the manufacturer, where the geometrical concentration is the ratio of the lens area to the cell surface, and the concentration is the geometrical concentration multiplied by the lens efficiency (or transmittance). To measure the DNI, a pyrheliometer has been installed on the solar tracker. In addition, a nearby weather station has been used to obtain the wind speed, air temperature, relative humidity and wind direction. Finally, software was installed on the tracker in order to estimate the solar elevation (h).

Artificial Neural Networks
ANNs resemble the brain's biological neural network in that they solve problems through a similar process. They work like a black-box model that connects the output to the input through fully connected neurons or nodes. More specifically, these nodes are connected to the inputs and the output by weights, which are calculated by a given algorithm. ANNs are used to model a system, to identify patterns, or to obtain a non-linear mapping between the input and output vectors. Only a small amount of system knowledge is required. Therefore, ANNs can be very useful in complicated modelling tasks when supervised training methods are applied, since the weights and biases (the ANN parameters) and the structure of the ANN can be learned from the data.
One of the less complicated ANNs is the RBF network. It consists of three fully connected layers: the input, the hidden and the output layers. A node is assigned to each of the input variables. The input signals then pass without weights to the hidden layer. The hidden layer contains the transfer functions, which are also called RBFs.
An RBF has a centre position and a radius, or 'centre' and 'width', in an equivalent way to a Gaussian function. The highest output is given when the input variables are close to the centre position, and the function decreases monotonically as the distance from the centre increases. The radius defines how fast the RBF decreases: the decrease is quick when the radius is small and slow when the radius is big. The Gaussian function, which is well known as a radially symmetric function, is commonly used to activate the neurons (n) of the hidden layer, see Equation (1):
$$\varphi_i(X_r) = \exp\left(-\frac{\lVert X_r - C_i \rVert^2}{2\sigma_i^2}\right), \quad i = 1, \dots, n \tag{1}$$
In the previous expression, C_i is the centre of the ith RBF unit, X_r is the input, σ_i is the radius (width), and n is the number of nodes in the hidden layer.
The output of an RBF network is the sum of all the weighted outputs of the units that compose the hidden layer, with the bias term added at the output node, as shown in Equation (2):
$$y_k(X_r) = \sum_{i=1}^{n} w_{ik}\,\varphi_i(X_r) + b_k, \quad k = 1, \dots, m \tag{2}$$
where φ_i(X_r) is the output of the ith RBF unit, w_ik is the weight that connects the ith RBF unit in the hidden layer to the kth output, b_k is the bias term of the kth output, and m is the number of outputs in the ANN. The RBF network is suitable for short-term photovoltaic power prediction because it is able to solve non-linear problems, thanks to its training process, which combines unsupervised learning in the hidden layer with supervised learning in the output layer.
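The forward pass described above can be sketched in a few lines. This is a minimal illustration, assuming the common Gaussian form exp(-||x - c||^2 / (2σ^2)) for the hidden units; the function names and the toy parameter values are hypothetical and are not taken from the model used in this work.

```python
import math

def gaussian_rbf(x, centre, radius):
    """Gaussian activation of one hidden unit (assumed form, see lead-in)."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, centre))
    return math.exp(-sq_dist / (2.0 * radius ** 2))

def rbf_forward(x, centres, radii, weights, biases):
    """Return the m network outputs for one input vector x.

    centres: n centre vectors, radii: n widths,
    weights[i][k]: weight from hidden unit i to output k,
    biases: m bias terms, one per output node.
    """
    # Inputs reach the hidden layer without weights.
    hidden = [gaussian_rbf(x, c, r) for c, r in zip(centres, radii)]
    # Each output is the weighted sum of the hidden activations plus a bias.
    return [
        sum(h * w_row[k] for h, w_row in zip(hidden, weights)) + biases[k]
        for k in range(len(biases))
    ]

# Toy network: 2 inputs, 2 hidden units, 1 output (illustrative values only).
centres = [[0.0, 0.0], [1.0, 1.0]]
radii = [1.0, 1.0]
weights = [[2.0], [0.5]]
biases = [0.1]
output = rbf_forward([0.0, 0.0], centres, radii, weights, biases)
```

In this toy case the input coincides with the first centre, so the first unit contributes its full weight while the second unit is attenuated by its distance from the input.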
A gradient-based algorithm, which minimises the training error, has been used to train the RBF network. The data have been divided into three different datasets: (i) training, (ii) generalisation and (iii) testing. The training process is carried out using the training dataset. It is terminated when the minimum error is obtained on the generalisation dataset, which is composed of data not seen during the training process. This scheme is used to avoid the problem known as over-training. The third dataset (testing dataset), which is not used during the training process, is used afterwards to compare different trained models with different model structures [59,60].
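The stopping scheme described above can be illustrated with a small sketch. For brevity it fits a one-variable linear model as a stand-in for the RBF network, so only the early-stopping logic is meant to carry over; the function name, learning rate and `patience` parameter are assumptions made for this example.

```python
def train_with_early_stopping(train, valid, lr=0.1, max_epochs=500, patience=20):
    """Fit y = w*x + b by gradient descent on `train` and return the
    (error, w, b) triple with the lowest error on `valid`.

    The linear model is a stand-in for the RBF network (assumption).
    """
    def mse(data, w, b):
        return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

    w, b = 0.0, 0.0
    best = (mse(valid, w, b), w, b)
    stale = 0  # epochs since the generalisation error last improved
    for _ in range(max_epochs):
        # Gradient of the training MSE with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in train) / len(train)
        grad_b = sum(2 * (w * x + b - y) for x, y in train) / len(train)
        w -= lr * grad_w
        b -= lr * grad_b
        err = mse(valid, w, b)
        if err < best[0]:
            best = (err, w, b)
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                break  # generalisation error stopped improving
    return best

# Noise-free toy data drawn from y = 2x + 1 (illustrative values only).
train = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
valid = [(0.5, 2.0), (1.5, 4.0)]
best_err, w, b = train_with_early_stopping(train, valid)
```

The parameters returned are those that performed best on the held-out data, not necessarily those of the last epoch, which is what protects against over-training.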

Proposed Prediction Model for Very-Short-Term Power Prediction
The available data consist of 92 days between 2016 and 2018, with a sampling time of 1 min. As an example, part of these data is shown in Figure 3. In more detail, the top graph shows the DNI, the second graph the air temperature, the third graph the wind speed and the bottom graph the solar elevation.
The data have been filtered to remove noise and measurement errors, leaving a final number of 29,960 samples. As pointed out before, three datasets have been created: (i) training, (ii) generalisation and (iii) testing. The samples have been assigned to these three datasets randomly: 35% for training, 35% for generalisation, and the remaining 30% for testing. All missing data that appear as blank spaces in Figure 3 have been removed. Moreover, since the input points are taken randomly and not in their time order, this deletion does not affect the simulation.
In this work, the following environmental data have been used as inputs: wind direction (W_d), T_air, AM based on the (measured) solar elevation, W_s, the azimuth angle of the tracker and the DNI. They are used along with one and two lags of the output power of the HCPV system, that is, the power signal delayed by one and two time steps. Several numbers of nodes in the hidden layer, i.e., 9, 12, 15, 21, and 24, have been tested in order to train different models, while the output was the power of the HCPV system. The structure of the RBF used in this work is shown in Figure 4. This figure shows the black-box ANN with all its external inputs and the feedback of the output power after one and two time delays (one and two lags).
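A random 35/35/30 partition of this kind could be produced as sketched below. The function name, the fixed seed and the use of sample indices in place of real measurements are assumptions made for illustration only.

```python
import random

def split_dataset(samples, fractions=(0.35, 0.35, 0.30), seed=0):
    """Randomly partition samples into training, generalisation and testing sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = round(fractions[0] * len(shuffled))
    n_gen = round(fractions[1] * len(shuffled))
    train = shuffled[:n_train]
    generalisation = shuffled[n_train:n_train + n_gen]
    testing = shuffled[n_train + n_gen:]  # whatever remains (about 30%)
    return train, generalisation, testing

# Indices stand in for the 29,960 filtered samples.
train, generalisation, testing = split_dataset(range(29960))
```

If each sample carries its own delayed power values, those values would need to be attached before shuffling, since the time order is lost afterwards.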
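The construction of input vectors with lagged power can be sketched as follows. The alignment used here (environmental inputs at time t combined with the power at t-1 and t-2 to predict the power at t) is an assumption, as are the function name and the toy values; the sketch also ignores gaps in the record, across which lags would not be contiguous.

```python
def build_lagged_samples(exogenous, power, n_lags=2):
    """Pair each time step with its environmental inputs and lagged power.

    exogenous[t]: list of environmental inputs at time t
    power[t]:     measured output power at time t
    Returns (inputs, target) pairs, inputs = exogenous[t] + [P(t-1), ..., P(t-n_lags)].
    """
    samples = []
    for t in range(n_lags, len(power)):
        lags = [power[t - k] for k in range(1, n_lags + 1)]
        samples.append((list(exogenous[t]) + lags, power[t]))
    return samples

# Toy record: two environmental inputs per step (illustrative values only).
exogenous = [[800.0, 20.0], [810.0, 20.5], [805.0, 21.0], [790.0, 21.2]]
power = [1.0, 1.1, 1.05, 0.98]
samples = build_lagged_samples(exogenous, power)
```

The first two time steps produce no sample because their lagged values are not available.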

Results and Analysis
The forecasting accuracy of the PV power generation is a key factor for ensuring grid stability and promoting PV installation. Thus, an accurate assessment of the performance of the PV power forecasting model is an important part of the forecasting process.
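Several simulations have been tested; the simulation presented in this paper uses a 1-min sampling time. All the models were trained using the Root Mean Square Error (RMSE) index. From those models, the best five are selected according to the lowest ratio of the RMSE for one step ahead to the Root Mean Square (RMS) of the output power, that is, RMSE1/RMS(P). The RMSE formula and the RMS formula for a signal X are shown in Equations (3) and (4), respectively:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{j=1}^{N}\left(P_{\mathrm{prediction},j} - P_{\mathrm{measured},j}\right)^2} \tag{3}$$
$$\mathrm{RMS}(X) = \sqrt{\frac{1}{N}\sum_{j=1}^{N} X_j^2} \tag{4}$$
where N is the number of samples, and P_prediction,j and P_measured,j are the predicted and the measured power for sample j.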
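The selection ratio can be computed directly from the standard definitions of the RMSE and the RMS. The sketch below uses hypothetical function names and toy values, not data from this work.

```python
import math

def rmse(predicted, measured):
    """Root mean square error between predictions and measurements."""
    n = len(measured)
    return math.sqrt(sum((p - m) ** 2 for p, m in zip(predicted, measured)) / n)

def rms(signal):
    """Root mean square of a signal."""
    return math.sqrt(sum(v ** 2 for v in signal) / len(signal))

def normalised_rmse(predicted, measured):
    """Ratio RMSE / RMS(P), which expresses the error relative to the signal level."""
    return rmse(predicted, measured) / rms(measured)

# Toy values (illustrative only).
measured = [3.0, 4.0]
predicted = [3.0, 5.0]
ratio = normalised_rmse(predicted, measured)
```

Dividing by the RMS of the measured power makes the index dimensionless, so it can be read as a percentage.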
Standard performance measures are necessary to evaluate prediction models and compare them. Therefore, various evaluation methods have been used to assess the accuracy of PV power forecasting models [61]. In particular, the Mean Bias Error (MBE) [62], the Mean Absolute Error (MAE) [30], the Mean Absolute Percentage Error (MAPE) [63], the Mean Square Error (MSE) [64] and the RMSE [65], whose formula has been shown before, are commonly used to evaluate the accuracy of PV power prediction models. The formulas of these indexes are shown below:
$$\mathrm{MBE} = \frac{1}{N}\sum_{j=1}^{N} E_j$$
$$\mathrm{MAE} = \frac{1}{N}\sum_{j=1}^{N} \left|E_j\right|$$
$$\mathrm{MAPE} = \frac{100}{N}\sum_{j=1}^{N} \left|\frac{E_j}{P_{\mathrm{measured},j}}\right|$$
$$\mathrm{MSE} = \frac{1}{N}\sum_{j=1}^{N} E_j^2$$
where, in the previous equations, N is the number of samples and E is the error, or difference, between the power prediction (P_prediction) and the real power measured (P_measured). The MBE indicates whether the model over- or underestimates the power. The MAE makes the accuracy of the forecast clear, as it calculates the average error between the forecast and the measurements, whereas the MAPE is more useful for comparing several different forecasts on different time series. The MSE is important in statistical modelling, and the RMSE provides quick insight into the variance and standard deviation of the errors, which makes both widely used in academic works. Nevertheless, their applicability is limited because they depend on the scale. Moreover, they are more sensitive to outliers than the MAE because of the squared error.
The performance results of the best five models according to the ratio RMSE1/RMS(P) are shown in Table 3, together with the number of nodes in their hidden layer. In Table 3, RMSE1 is the RMSE of the 1-step-ahead error, and RMSE1/RMS(P) is the ratio of the RMSE for 1 step ahead to the RMS of the power signal (output), whereas RMSE15 and RMSE15/RMS(P) have the same meaning but for 15 steps ahead.
Table 3 shows that the best five models have an error between 9.1% and 9.2% for 1-step-ahead prediction, whereas for 15-steps-ahead prediction the errors range from 22% for the best model to 26% for the worst one. It is important to note that, although up to 24 nodes in the hidden layer have been tested, the best results for 1 and 15 steps ahead were obtained with 18 and 21 nodes, and only one of the best five models has 24 nodes in the hidden layer.
The top graph of Figure 5 shows the predictions of the best five models for one step ahead, calculated for one week of data from the testing dataset. The results show a good estimation, since the predicted power captures the behaviour of the measured power with promising accuracy; for this reason, all the signals overlap in Figure 5a. Even when a single day is enlarged, specifically the third day (see the bottom graph of Figure 5), the five models reproduce the behaviour of the real system with good accuracy. However, it is important to highlight that the models are more accurate on sunny days than on cloudy ones. Figure 6 shows the ratio between the RMSE and RMS(P) for predictions several steps ahead. As expected, the error grows with the number of steps ahead, from 9% for 1 step ahead to 22% for 15 steps ahead.
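The four indexes discussed above can be evaluated together, as sketched below. The error is taken as prediction minus measurement, so that a positive MBE corresponds to over-forecasting; the function name and the toy values are hypothetical.

```python
def error_metrics(predicted, measured):
    """Return MBE, MAE, MAPE (in percent) and MSE for paired samples.

    The error is defined as E = P_prediction - P_measured.
    MAPE is undefined when a measured value is zero (e.g. at night),
    so such samples are assumed to have been filtered out beforehand.
    """
    errors = [p - m for p, m in zip(predicted, measured)]
    n = len(errors)
    return {
        "MBE": sum(errors) / n,
        "MAE": sum(abs(e) for e in errors) / n,
        "MAPE": 100.0 * sum(abs(e / m) for e, m in zip(errors, measured)) / n,
        "MSE": sum(e ** 2 for e in errors) / n,
    }

# Toy values (illustrative only): one over-forecast and one under-forecast.
measured = [2.0, 4.0]
predicted = [2.5, 3.0]
metrics = error_metrics(predicted, measured)
```

In this toy case the two errors partly cancel in the MBE but not in the MAE, which is why the two indexes are reported side by side.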
As in Figure 5, the top graph of Figure 7 shows the predicted power of the best model over one week for 5, 10, and 15 steps ahead, together with 1 step ahead, in comparison with the measured power, whereas the bottom graph of Figure 7 shows a zoom of the third day. It is important to note that, when a large step ahead is used, the prediction still captures the behaviour of the measured power, but with less accuracy and with a dramatic overshoot at some points when the power fluctuates. This can be easily seen in the zoom of the third day in the bottom graph of Figure 7. Nevertheless, the models show promising results. In order to give a wider view of the results, different error measures for the best five models are shown in Table 4 for 1 and for 15 steps ahead. The positive values of the MBE show that the predictions for both 1 and 15 steps ahead tend to over-forecast. From these results it can be inferred that, as expected, forecasting 15 steps ahead gives worse results than forecasting 1 step ahead. This degradation in the 15-steps-ahead prediction could be due to two factors. Firstly, each forecast depends on the steps before it; thus, if the previous steps are not accurate enough, the forecast is degraded. Secondly, as mentioned before, the CPV system uses the DNI as input instead of indirect irradiation. As is well known, cloud transients block the DNI; thus, different cloud speeds could increase the error percentage, and this is more noticeable with a larger step-ahead prediction. For example, with 15 steps ahead, short cloud transients that last less than the prediction horizon can affect the system output without being detected by the model.
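The dependence of each forecast on the previous ones can be illustrated with a recursive multi-step sketch, in which every prediction is fed back as a lagged input for the next step. This is only one plausible reading of the procedure: the function names, the toy model and the assumption that the environmental inputs for the future steps are supplied by the caller are all hypothetical.

```python
def recursive_forecast(model, exogenous_future, last_power, prev_power, steps):
    """Forecast `steps` values by feeding each prediction back as a lag.

    model(inputs) -> predicted power, where
    inputs = environmental inputs + [P(t-1), P(t-2)].
    exogenous_future[k]: environmental inputs assumed for step k.
    """
    forecasts = []
    lag1, lag2 = last_power, prev_power
    for k in range(steps):
        prediction = model(list(exogenous_future[k]) + [lag1, lag2])
        forecasts.append(prediction)
        # The new prediction becomes lag 1; the old lag 1 becomes lag 2.
        lag1, lag2 = prediction, lag1
    return forecasts

def toy_model(inputs):
    """Stand-in for a trained network: the mean of the two lagged values."""
    return 0.5 * (inputs[-2] + inputs[-1])

forecasts = recursive_forecast(toy_model, [[0.0]] * 3, last_power=1.0,
                               prev_power=0.0, steps=3)
```

Because later steps are computed from earlier predictions rather than from measurements, any error made in the first steps propagates through the rest of the horizon.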
The MAPE index provides information on the short-term performance. It stands for the measure of the variation of predicted values around the measured data [25]. It is often useful in practice because of the very intuitive interpretation as a relative error.
Table 4 shows that the MAPE ranges between 10.4% and 11.4% for the best 1-step-ahead models and between 28.4% and 30.8% for the 15-steps-ahead ones. The MAE, in turn, ranges between 0.0245 and 0.0252 for 1 step ahead and between 0.1252 and 0.1330 for 15 steps ahead. These results show that the RBFNN is suitable for energy prediction over short time horizons, since the model results degrade as the prediction horizon increases.

Conclusions and Future Works
In this work, an RBFNN has been developed with the aim of predicting the behaviour of a new type of CPV. The predictions of the power produced by the HCPV facility have shown promising results with a simple structure. The fact that developing an RBF model is very simple, and that the computational resources it requires are small and easily available, gives the model the advantage of being usable in several fields.
The power prediction, in this case, follows the behaviour of the real power measured from the HCPV system with promising accuracy, even though the error is significant for long prediction horizons. Nevertheless, it must be taken into account that these results are due to the small amount of data available; thus, they could be improved by collecting more data and using them to obtain a more precise model. Moreover, the data could be collected continuously to obtain a dynamic model for predicting the output power of the HCPV system. In this way, the model could play a key role in integrating the system with other power sources and/or with the grid, or even in implementing power storage in the building or area where it is used. Other machine learning methods, for example deep learning neural networks, which have been shown over recent years to be effective at forecasting time series, will be tested in the future to compare their results with the ones obtained in this work. In addition, since LSTM neural networks are suitable for forecasting time series such as the one used in this work, they will be applied in further research.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: