Deep RNN-Based Photovoltaic Power Short-Term Forecast Using Power IoT Sensors †

Photovoltaic (PV) power fluctuations caused by weather changes can lead to short-term mismatches between power demand and supply. Therefore, to operate the power grid efficiently and reliably, short-term PV power forecasts are required to respond to these fluctuations. In this paper, we propose a deep RNN-based short-term PV power forecast. To reflect the impact of weather changes, the proposed model utilizes an on-site weather IoT dataset and power data, collected in real time. We investigated various parameters of the proposed deep RNN-based forecast model and combinations of weather parameters to find an accurate prediction model. Experimental results showed that the accuracies of the 5 and 15 min ahead PV power generation forecasts, using 3 RNN layers with 12 time-steps, were 98.0% and 96.6% based on the normalized RMSE, respectively. Their $R^2$-scores were 0.988 and 0.949. In experiments for the 1 and 3 h ahead PV power generation forecasts, the accuracies were 94.8% and 92.9%, respectively, and the $R^2$-scores were 0.963 and 0.927. These experimental results showed that the proposed deep RNN-based short-term forecast algorithm achieved higher prediction accuracy than the ARIMA and SVR-RBF models used for comparison.


Introduction
Most electrical energy has been supplied by fossil fuels. As the use of fossil fuels for power generation increases, so does concern about environmental pollution. Accordingly, inexhaustible and clean renewable energy sources, such as solar and wind energy, are becoming more significant and attracting more attention. The International Renewable Energy Agency (IRENA) reported that the world renewable energy capacity increased to 2537 GW in 2019 [1]. Solar energy is one of the promising renewable resources as a substitute for fossil fuels. In particular, installed photovoltaic (PV) capacity increased more than 14-fold over 10 years, from 41.5 GW in 2010 to 586.4 GW in 2019.
As the portion of PV power in the total electrical energy increases, the impact of PV power generation on the power grid is also increasing. PV power fluctuates rapidly with daytime weather changes due to cloud shadows and rainfalls [2]. This PV power fluctuation penetrates the power grid, causing a short-term mismatch between power supply and demand. It leads to instability and inefficiency of the power grid operation. Managing a power grid stably and efficiently requires short-term PV power forecasts to respond to PV power fluctuation [3][4][5].
Statistical and machine learning methods, such as ARMA (Auto-Regressive Moving Average), ARIMA (Auto-Regressive Integrated Moving Average), and regression models, have been studied to forecast PV power generation [6][7][8]. Various neural network approaches, such as artificial neural networks (ANN), the gray prediction model [9], the BP-ANN model [10], and the radial basis function (RBF) network [11], have been proposed. To further improve the accuracy of PV power prediction, hybrid models were developed [12,13]. Those previous studies were based on historical PV power and weather data and were not suitable for real-time short-term PV forecasting.
The online sequential extreme learning machine with a forgetting mechanism (FOS-ELM) was introduced for short-term PV power prediction [14]. Predicted data computed by a weather forecast model was used as input to the PV prediction model. However, it required a large amount of computation for weather forecasting. A hybrid one-day-ahead power forecasting model was proposed, combining wavelet transform, particle swarm optimization, and support vector machine (Hybrid WT-PSO-SVM) [15]. It used actual power data from a PV power system SCADA (Supervisory Control and Data Acquisition) and numerical weather forecast meteorological data for one year with a one-hour time step. To forecast daily generated PV power, several LSTM-RNN models were studied [16]. A wavelet-based LSTM-DNN model was proposed [17], which used predicted temperature and solar radiation for short-term PV power prediction. Also, to enhance the short-term (24 h ahead) forecasting accuracy of distributed PV power output, an adaptive hybrid predictor subset selection strategy was proposed, which combined a binary genetic algorithm (BGA) and support vector regression (SVR) [18]. For a real-time short-term PV power forecast, feature data were collected at 5-min intervals [19], and a weighted Gaussian process regression was proposed to detect outliers in the data and thereby improve the prediction accuracy.
In this paper, we propose a short-term PV power generation forecast algorithm based on the deep recurrent neural network (RNN). To reflect the impact of weather changes, the proposed model utilizes an on-site weather IoT dataset and power data, collected in real time. To build a model with low cost and low computation, we used only IoT sensor data, such as solar radiation, module temperature, ambient temperature, wind speed, and humidity, without image data. In experiments, we investigated the best combination of parameters to improve forecast accuracy. The proposed deep RNN consists of multiple RNN layers. Each RNN layer includes long short-term memory (LSTM) and layer normalization units. In the experiments, a very short-term forecast algorithm predicts the power 5 or 15 min ahead, and a short-term forecast algorithm predicts the power 1 h or 3 h ahead. The experimental results showed that the very short-term forecast accuracy was 99.01% (nMAE)/98.02% (nRMSE) for the 5 min ahead prediction and 98.16% (nMAE)/96.58% (nRMSE) for the 15 min ahead prediction using three RNN layers. The short-term prediction accuracy was 93.75% (nRMSE) and 96.6% (nMAE) for the 1 h ahead prediction and 90.29% (nRMSE) and 94.7% (nMAE) for the 3 h ahead prediction. This paper is organized as follows. Section 2 describes the photovoltaic power generation system and Power IoT data. In Section 3, the short-term PV power forecasting algorithm based on the deep RNN is proposed. The experimental results and performance evaluation are discussed in Section 4, and the conclusion is summarized in Section 5.

Photovoltaic Power Generation System with Power IoT Sensors
Since PV power systems use solar radiation to generate electricity, PV power is sensitive to weather changes. Therefore, on-site weather information as well as PV power data is needed to predict short-term PV power generation [20]. Various weather information providers, such as the Korea Meteorological Administration (KMA), regularly publish observed weather data and weather forecasts, which can be collected over the Internet. However, they are released at intervals of at least 1 h, so they are not suitable for short-term predictions. In addition, since the published meteorological data are measured at meteorological observation sites, they may differ from the on-site weather conditions of the solar system considered in this study.
To obtain PV power data and on-site weather information for this study, we adopted a data acquisition system using the PIoT (Power Internet of Things) device shown in Figure 1. PIoT devices measure the DC current and voltage to monitor the generated PV power. We also use five IoT sensors to measure solar radiation, module temperature, ambient temperature, humidity, and wind speed. The collected PIoT data are transmitted to and stored on the monitoring server.

The Correlation Analysis of PIoT Data in the Photovoltaic System
To design a short-term forecast algorithm, it is important to select the learning parameters. For the short-term PV power generation forecast, we analyze the correlation between the generated power data and the PIoT data in a PV system and select effective parameters accordingly [21]. The Pearson correlation coefficient $r$ is used for the correlation analysis as follows:
$$r = \frac{\sum_{i}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i}(x_i - \bar{x})^2}\,\sqrt{\sum_{i}(y_i - \bar{y})^2}}$$
where $x_i$ and $y_i$ are the individual samples and $\bar{x}$ and $\bar{y}$ are their sample means. Figure 2 shows the correlation between the data parameters measured by the PIoT sensors, including the power and weather data. In general, PV power generation performance is related to solar irradiation, module temperature, and ambient temperature. Wind speed and humidity, however, are usually not taken into consideration since they are not directly related to the current and voltage equations of PV power generation. As shown in Figure 2, humidity and wind speed are related to the module temperature and therefore affect PV power generation indirectly [22,23]. They are less significant than solar radiation, module temperature, and ambient temperature. The influence of each parameter changes with the forecast horizon. In this study, we investigate the prediction impact of these parameters through experiments.
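The correlation analysis described above can be sketched in a few lines of NumPy. This is only an illustration of the Pearson coefficient: the function name `pearson_r` and the toy readings are hypothetical and are not the measured PIoT data.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length 1-D samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()  # deviations from the sample mean of x
    dy = y - y.mean()  # deviations from the sample mean of y
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2)))

# Hypothetical toy readings (illustration only, not measured data)
solar_radiation = [0.0, 120.0, 450.0, 800.0, 620.0, 150.0]  # W/m^2
pv_power = [0.0, 0.3, 1.1, 2.0, 1.6, 0.4]                   # kW

r = pearson_r(solar_radiation, pv_power)
print(f"r(solar radiation, PV power) = {r:.3f}")
```

Applying the same function to every pair of sensor channels would give a correlation matrix of the kind summarized in Figure 2.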

The Review of RNN and LSTM
A general artificial neural network (ANN) treats its inputs and outputs as independent of each other. In time-series applications, inputs and outputs are generally interdependent, so the ANN is considered unsuitable for them. The recurrent neural network (RNN) was proposed for time-series applications, as shown in Figure 3 [24]. Unlike an ANN, an RNN unit takes the previous hidden state and the present input as inputs and produces a new hidden state and output. During training, the RNN parameters are updated using the backpropagation through time (BPTT) algorithm [25], which is indicated by the dotted line in Figure 3. Consider the example RNN shown in Figure 3. The hidden state $h_t$ at time $t$ can be calculated as follows, where the present input is $x_t$, the recurrent hidden state at time $t-1$ is $h_{t-1}$, $\sigma$ is the sigmoid function (or another non-linear function such as the hyperbolic tangent or the rectified linear function), $W$ and $U$ are the weight matrices, and $b$ is the bias:
$$h_t = \sigma(W x_t + U h_{t-1} + b)$$
As the length of a data sequence increases, the training time of an RNN increases drastically. Also, the gradients propagated through time can shrink toward zero or grow without bound, which is called the vanishing/exploding gradient problem. To overcome the vanishing/exploding gradient problem of the RNN model, the long short-term memory (LSTM) network was proposed [26]. The LSTM network consists of LSTM cells instead of RNN cells, as shown in Figure 4. Unlike an RNN cell, the LSTM cell can determine whether to retain or discard previous state information through a forget gate ($f_t$) and calculates the current hidden state as output accordingly. The forget gate $f_t$ is computed by applying the sigmoid function ($\sigma$) to the input data ($x_t$) and the previous hidden state ($h_{t-1}$). Its value, which lies between 0 and 1, determines how much of the previous cell state ($C_{t-1}$) is retained: if it is zero, none of the previous cell state is kept, and if it is one, all of it is kept.
First, the forget gate $f_t$ at time $t$ can be calculated as follows, where the forget gate weight is $W_f$, the forget gate bias is $b_f$, and $[h_{t-1}, x_t]$ denotes the concatenation of the previous hidden state and the present input:
$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$$
Second, the information value ($i_t$) and the new candidate value ($\tilde{C}_t$) can be calculated as follows, where the information value and new candidate weights are $W_i$ and $W_C$ and the information value and new candidate biases are $b_i$ and $b_C$:
$$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$$
$$\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)$$
Finally, the LSTM cell generates a new state value ($C_t$), which determines how much information is forgotten or kept at time $t+1$:
$$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$$
The output decision $o_t$ at time $t$ can be calculated as follows, where the output decision weight is $W_o$ and the output decision bias is $b_o$:
$$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$$
The current hidden state ($h_t$) is computed by multiplying $o_t$ by the hyperbolic tangent of $C_t$:
$$h_t = o_t * \tanh(C_t)$$
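To make the gate computations concrete, the following is a minimal NumPy sketch of one LSTM step in the standard formulation reviewed above (forget gate, information value, candidate value, cell state, output decision, hidden state). The helper names `lstm_step` and `init_params`, the random initialization, and the sizes are assumptions chosen for illustration; this is not the implementation used in the experiments.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; each gate acts on the concatenation [h_prev, x_t]."""
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(params["W_f"] @ z + params["b_f"])    # forget gate
    i_t = sigmoid(params["W_i"] @ z + params["b_i"])    # information value
    c_hat = np.tanh(params["W_C"] @ z + params["b_C"])  # new candidate value
    c_t = f_t * c_prev + i_t * c_hat                    # new cell state
    o_t = sigmoid(params["W_o"] @ z + params["b_o"])    # output decision
    h_t = o_t * np.tanh(c_t)                            # current hidden state
    return h_t, c_t

def init_params(n_in, n_hidden, seed=0):
    """Hypothetical small random initialization, for illustration only."""
    rng = np.random.default_rng(seed)
    params = {}
    for gate in ("f", "i", "C", "o"):
        params[f"W_{gate}"] = rng.normal(0.0, 0.1, size=(n_hidden, n_hidden + n_in))
        params[f"b_{gate}"] = np.zeros(n_hidden)
    return params

# Run a toy sequence of 12 time-steps with 6 input features through the cell.
n_in, n_hidden = 6, 8
params = init_params(n_in, n_hidden)
h = np.zeros(n_hidden)
c = np.zeros(n_hidden)
sequence = np.random.default_rng(1).random((12, n_in))
for x_t in sequence:
    h, c = lstm_step(x_t, h, c, params)
print(h.shape)
```

Because the hidden state is the product of a sigmoid output and a hyperbolic tangent, every component of `h` stays strictly inside (-1, 1), whatever the input sequence.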

Deep RNN-Based Short-Term Forecasting
For a short-term forecast of PV power generation, we propose a deep RNN with a multi-layer model, as shown in Figure 5. After the training process finishes, the test process evaluates the performance of the deep RNN model using the test dataset. The proposed deep RNN model takes as input the on-site data measured by PIoT sensors, namely PV power, solar radiation, module and ambient temperature, humidity, and wind speed, to forecast the short-term PV power generation, as shown in Figure 6. Time-series sequence data are collected periodically through the PIoT sensors. Before being fed into the proposed model, the data pass through a pre-processing stage. First, the collected sequence data are formatted to the requested interval to fit the input format of the short-term or very short-term forecast model. Next, each formatted input is normalized. In general, a photovoltaic system has a maximum electrical power generating capacity. Also, according to weather statistics, weather data generally vary within a certain range. The normalization of the input data is computed as follows:
$$x_{norm} = \frac{x - x_{min}}{x_{max} - x_{min}}$$
where $x$ is an input vector, $x_{norm}$ is the normalized input vector, $x_{max}$ is the maximum value of $x$, and $x_{min}$ is the minimum value of $x$. Input normalization helps to improve the accuracy and execution time of the training process [27].
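The min-max normalization above, together with the inverse mapping needed to compare predictions with measured power, can be sketched as follows. The function names and the toy readings are hypothetical; using 0 kW as the lower bound for power is an assumption, while 2.649 kW is the maximum electrical power listed in Table 1.

```python
import numpy as np

def minmax_normalize(x, x_min, x_max):
    """Scale x into [0, 1] using fixed bounds x_min and x_max."""
    x = np.asarray(x, dtype=float)
    return (x - x_min) / (x_max - x_min)

def minmax_denormalize(x_norm, x_min, x_max):
    """Inverse of minmax_normalize: map [0, 1] back to physical units."""
    x_norm = np.asarray(x_norm, dtype=float)
    return x_norm * (x_max - x_min) + x_min

P_MIN_KW = 0.0    # assumption: no generation corresponds to 0 kW
P_MAX_KW = 2.649  # maximum electrical power of the PV system (Table 1)

power_kw = np.array([0.0, 0.53, 1.32, 2.649])  # hypothetical readings
power_norm = minmax_normalize(power_kw, P_MIN_KW, P_MAX_KW)
power_back = minmax_denormalize(power_norm, P_MIN_KW, P_MAX_KW)
print(power_norm)
```

Fixed bounds (system capacity, climatological ranges) keep the scaling identical across the train, validation, and test sets, which is one plausible reading of the paragraph above.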
The proposed deep RNN model consists of a multi-layer RNN with layer normalization and one fully-connected layer. Figure 7 shows one example of a deep RNN, consisting of 3 layers with layer normalization and one fully-connected layer. The input layer of the RNN receives the normalized input vectors computed in the pre-processing stage. To overcome the vanishing and exploding gradient problems of the RNN model, each RNN cell is configured as an LSTM cell, introduced in Section 3.1. At layer $i$, the output of the $t$th RNN cell is computed as follows, where $x^{i-1}_t$ is the $t$th output of the $(i-1)$th RNN layer, including the RNN cell and layer normalization computation:
$$h^{i}_t = \mathrm{LSTM}(x^{i-1}_t, h^{i}_{t-1})$$
In general, batch normalization is not suitable for the RNN model. Instead of batch normalization, layer normalization [28] is applied to the proposed model by computing the normalization statistics separately at each time step as follows, where LN is the computation of layer normalization:
$$x^{i}_t = \mathrm{LN}(h^{i}_t)$$
The layer normalization at the $t$th time step, $\mathrm{LN}(h_t)$, is computed as follows:
$$\mu_t = \frac{1}{H}\sum_{j=1}^{H} h_{t,j}, \qquad \sigma_t = \sqrt{\frac{1}{H}\sum_{j=1}^{H}(h_{t,j} - \mu_t)^2}, \qquad \mathrm{LN}(h_t) = \frac{g}{\sigma_t} \odot (h_t - \mu_t) + b$$
where $H$ is the number of hidden units in an LSTM cell, $\odot$ is the element-wise multiplication between two vectors, and $b$ and $g$ are bias and gain parameters of the same dimension as $h_t$. Layer normalization reduces the training and test computation time of our model. It can also effectively stabilize the hidden state dynamics in the RNN. Finally, a fully connected layer applied to the output of the last time-step LSTM produces the PV power generation prediction. We optimize the proposed deep RNN model, trained with backpropagation, using the Adam optimizer [29] to minimize the prediction loss. Furthermore, we apply a decayed learning rate in the training optimization stage. As the training epoch increases and the prediction loss decreases, the learning rate is decayed to approach the optimal point. We also apply gradient clipping to our deep RNN model to limit the magnitude of the gradient. Gradient clipping makes stochastic gradient descent (SGD) behave better in the vicinity of cliffs in the loss surface, which commonly occur in recurrent neural networks. After training, the inference model built on the deep RNN is used for the short-term prediction of PV power generation. Since the validation and test datasets are normalized in the pre-processing stage, the predicted output must be denormalized before it is compared with the measured PV power.
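Two of the training components described above, layer normalization over the hidden units of one time step and gradient clipping, can be sketched in NumPy as follows. The small constant `eps` is a numerical-stability assumption that is not part of the formula in the text, and clipping by the global norm is only one common variant; the text does not state which clipping rule or threshold was used.

```python
import numpy as np

def layer_norm(h_t, g, b, eps=1e-5):
    """Layer normalization of one hidden-state vector h_t with H units."""
    mu = h_t.mean()                                   # mean over hidden units
    sigma = np.sqrt(((h_t - mu) ** 2).mean() + eps)   # std over hidden units
    return g / sigma * (h_t - mu) + b                 # gain g and bias b are element-wise

def clip_by_global_norm(grads, max_norm):
    """Rescale a list of gradient arrays so their joint L2 norm is at most max_norm."""
    total = np.sqrt(sum(np.sum(gr ** 2) for gr in grads))
    scale = min(1.0, max_norm / (total + 1e-12))
    return [gr * scale for gr in grads]

H = 8
h_t = np.arange(H, dtype=float)                       # toy hidden state
y = layer_norm(h_t, g=np.ones(H), b=np.zeros(H))

grads = [np.full(4, 3.0), np.full(4, 4.0)]            # toy gradients, joint norm 10
clipped = clip_by_global_norm(grads, max_norm=5.0)    # hypothetical threshold
clipped_norm = np.sqrt(sum(np.sum(gr ** 2) for gr in clipped))
print(y.mean(), y.std(), clipped_norm)
```

With unit gain and zero bias, the normalized vector has approximately zero mean and unit standard deviation, and the clipped gradients keep their direction while their joint norm is reduced to the threshold.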

Experimental Environments and Performance Evaluation
For this study, we used a PV power platform consisting of 10 serial-connected solar panels on the roof of the Engineering Building at Konkuk University, Seoul, Korea, as shown in Figure 8. For the proposed deep RNN-based short-term PV power forecast model, we measured the on-site PV power and weather data using PIoT sensors installed in the PV power platform. The DC voltage and current of the PV system and the weather data, namely solar radiation, module and ambient temperature, humidity, and wind speed, were collected using the PIoT sensors summarized in Table 1. We partitioned the whole dataset into three sets (train, validation, and test) with a ratio of 3:1:1. We conducted the experiments on an Nvidia Tesla P100 server. The training parameters are presented in Table 2.

| Parts | Device |
|---|---|
| Solar panels | 10 serial-connected LG250S9W panels |
| Maximum electrical power | 2.649 kW |
| Solar radiance sensor | SE1000-SEN-IRR-S1 |
| Module temperature sensor | SE1000-SEN-TMOD-S2 |
| Ambient temperature sensor | SE1000-SEN-TAMB-S2 |
| Wind speed sensor | SE1000-SEN-WIND-S1 |
| Humidity sensor | DY-HTT1 |

We trained the model 10 times, tested each trained model, and reported the average prediction accuracy. To evaluate the prediction accuracy, the normalized Root Mean Square Error (nRMSE) and the normalized Mean Absolute Error (nMAE), both normalized by the maximum power capacity of the PV system, and the $R^2$-score were computed as follows [30]:
$$\mathrm{nRMSE} = \frac{1}{P_m}\sqrt{\frac{1}{N}\sum_{i=1}^{N}(\hat{P}_i - P_i)^2}$$
$$\mathrm{nMAE} = \frac{1}{P_m}\cdot\frac{1}{N}\sum_{i=1}^{N}\left|\hat{P}_i - P_i\right|$$
$$R^2 = 1 - \frac{\sum_{i=1}^{N}(P_i - \hat{P}_i)^2}{\sum_{i=1}^{N}(P_i - \bar{P})^2}$$
where $\hat{P}_i$ is the predicted power, $P_i$ is the measured power, $P_m$ is the maximum capacity of the PV power, $\bar{P}$ is the mean of the measured power, and $N$ is the number of samples. nRMSE and nMAE can be expressed as percentages (%) since the normalized power, varying from 0 to 1, is used as input.
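The three evaluation metrics named above can be sketched in NumPy as follows. The function names and toy values are hypothetical, and the helper `accuracy_percent` assumes that the reported accuracy is 100% minus the normalized error, which matches how the results are quoted but is not stated as a formula in the text.

```python
import numpy as np

def nrmse(p_true, p_pred, p_max):
    """Root mean square error normalized by the maximum PV capacity p_max."""
    err = (np.asarray(p_pred, dtype=float) - np.asarray(p_true, dtype=float)) / p_max
    return float(np.sqrt(np.mean(err ** 2)))

def nmae(p_true, p_pred, p_max):
    """Mean absolute error normalized by the maximum PV capacity p_max."""
    err = (np.asarray(p_pred, dtype=float) - np.asarray(p_true, dtype=float)) / p_max
    return float(np.mean(np.abs(err)))

def r2_score(p_true, p_pred):
    """Coefficient of determination of the predictions."""
    p_true = np.asarray(p_true, dtype=float)
    p_pred = np.asarray(p_pred, dtype=float)
    ss_res = np.sum((p_true - p_pred) ** 2)
    ss_tot = np.sum((p_true - p_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def accuracy_percent(error):
    """Assumption: accuracy (%) = 100 * (1 - normalized error)."""
    return 100.0 * (1.0 - error)

P_MAX_KW = 2.649                             # maximum electrical power (Table 1)
measured = np.array([0.5, 1.0, 2.0, 1.5])    # hypothetical measured power, kW
predicted = np.array([0.6, 0.9, 2.1, 1.4])   # hypothetical predictions, kW

print(accuracy_percent(nrmse(measured, predicted, P_MAX_KW)))
print(accuracy_percent(nmae(measured, predicted, P_MAX_KW)))
print(r2_score(measured, predicted))
```

In this toy case every absolute error is 0.1 kW, so nRMSE and nMAE coincide; on real data nRMSE is at least as large as nMAE because it weights large errors more heavily.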

Experiment Results and Discussion
We designed two groups of experiments: one for the very short-term forecast model and the other for the short-term forecast model. The very short-term forecasting experiments considered 5 min and 15 min ahead predictions of PV power generation. For the short-term forecast, experiments were conducted for the 1 h ahead and 3 h ahead forecasts. To find the best short-term forecasting model, we performed experiments varying the number of time-steps of an RNN layer. Table 3 presents the results of the very short-term forecast. The time interval of an RNN time-step was 5 min. Typically, weather features change less over 5 or 15 min than over a longer period such as 1 h, 3 h, or one day. As shown in Table 3, the deep RNN-based forecast model using 12 time-steps achieved a smaller prediction error than those with 24, 36, or 48 time-steps. Table 4 shows the error of the very short-term forecast for different numbers of RNN layers. The deep RNN consisting of 3 layers achieved the best prediction accuracy. As the number of layers increased further, the prediction accuracy degraded slightly. We therefore chose the 3-layer deep RNN for all experiments in the very short-term forecast. The 5 min ahead prediction accuracy of PV power generation was 98.9% (nMAE) and 97.9% (nRMSE). The 15 min ahead prediction accuracy of PV power was 98.2% (nMAE) and 96.5% (nRMSE) using the 3-layer RNN with 12 time-steps. To forecast very short-term PV power generation, we used all of the PIoT sensor data based on the correlation analysis in Section 2.2. Further experiments were conducted to find the combination of PIoT sensors with minimal forecasting error. Table 5 shows the very short-term forecasting errors for different combinations of PIoT sensors. For very short-term PV power forecasting, the combination of PV power (P), solar radiation (S), and wind speed (WS) minimized the error compared to the others.
Short-term temperature changes can be further affected by wind speed. Using the selected PIoT sensor combination, the 5 min ahead prediction accuracy of PV power improved to 99.1% (nMAE) and 98.0% (nRMSE). The 15 min ahead prediction accuracy of PV power was 98.4% (nMAE) and 96.6% (nRMSE) using the 3-layer RNN with 12 time-steps. Figures 9 and 10 show the results of the 5 and 15 min ahead forecasts using the selected PIoT sensor combination for 3 different weather cases. The deep RNN-based forecast model used 3 RNN layers, 12 RNN time-steps, and a sampling interval of 5 min. Figure 11 presents the 5 and 15 min ahead forecast results of PV power generation during 6 consecutive days, compared with the power measured in the PV platform. Even when the weather changed to a cloudy or rainy day, the prediction graph in Figure 11 shows that the proposed model produced reliable forecasts. The scatter graphs of the very short-term forecast are presented in Figure 12. The $R^2$-scores of the 5 min and 15 min ahead forecasts were 0.983 and 0.949, respectively.
For the short-term PV power generation forecast, the weather change over a longer period should be observed, since it is related to the PV power generation trend during a half-day or a full day. In the experiments, predictions were based on data collected over the preceding 6, 12, 18, or 24 h. For the short-term forecast, the sampling interval between two RNN time-steps was a half-hour (30 min) or one hour. If samples taken every 5 min were used, the number of RNN time-steps became too large, which degraded the overall prediction accuracy and increased the training time. Table 6 shows the forecast error of the short-term PV power generation according to the sampling interval, the number of time-steps, and the number of RNN layers. As shown in Table 6, the model with 3 RNN layers, 12 RNN time-steps, and a sampling interval of one hour achieved the smallest prediction error.
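The relationship between sampling interval, number of time-steps, and forecast horizon discussed above can be illustrated with a small windowing sketch. The function `make_windows` and the synthetic series are hypothetical, and taking every 12th sample to move from a 5 min to a 1 h interval is an assumption; the text does not say whether samples were decimated or averaged.

```python
import numpy as np

def make_windows(features, target, time_steps, horizon):
    """Build (X, y) pairs: X holds `time_steps` consecutive samples and
    y is the target `horizon` samples after the last sample in the window."""
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    n = len(target) - time_steps - horizon + 1
    if n <= 0:
        raise ValueError("series too short for the requested window and horizon")
    X = np.stack([features[i:i + time_steps] for i in range(n)])
    y = np.array([target[i + time_steps - 1 + horizon] for i in range(n)])
    return X, y

# Synthetic 5 min series with 3 feature columns (illustration only).
t = np.arange(100, dtype=float)
features = np.stack([t, 2.0 * t, 3.0 * t], axis=1)
target = features[:, 0]

# Very short-term case: 12 time-steps of 5 min, 15 min ahead (3 samples).
X, y = make_windows(features, target, time_steps=12, horizon=3)
print(X.shape, y.shape)

# Short-term case: keep every 12th sample to get a 1 h sampling interval.
hourly = features[::12]
print(hourly.shape)
```

With a 5 min interval, covering 12 h of history would need 144 time-steps per window, whereas a 1 h interval needs only 12, which is the trade-off described in the paragraph above.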
The 1 h ahead prediction accuracy of the PV power generation was 97.1% (nMAE) and 94.6% (nRMSE). The 3 h ahead prediction accuracy was 95.6% (nMAE) and 91.8% (nRMSE). Table 7 shows the short-term forecasting errors for various combinations of PIoT sensors. For the short-term PV power forecast, the combination of PV power (P), solar radiation (S), and humidity (H) was chosen because it appeared more stable than the other combinations. Using the selected PIoT sensor combination, the 1 h ahead prediction accuracy of PV power improved to 97.4% (nMAE) and 94.8% (nRMSE). The 3 h ahead prediction accuracy of PV power was 96.2% (nMAE) and 92.9% (nRMSE) using the 3-layer RNN with 12 time-steps.
In the experiments, the short-term forecast model used 3 RNN layers, 24 RNN time-steps, and a sampling interval of a half-hour. Figures 13 and 14 show the results of the 1 and 3 h ahead forecasts using this configuration. In Figure 15, the PV power generation forecast results for 6 consecutive days are compared with the power measured in the PV platform. The corresponding scatter graphs are presented in Figure 16, and the $R^2$-scores of the 1 h and 3 h ahead predictions were 0.963 and 0.927, respectively.
We compared the proposed model with other well-known forecasting models [31,32], as shown in Table 8. We implemented and optimized an ARIMA model and a support vector regression model with a radial basis function kernel (SVR-RBF) using the scikit-learn package [33]. ARIMA and SVR-RBF showed good performance for very short-term forecasting, but their performance degraded for short-term forecasting. In contrast, our proposed model showed consistently superior forecasting performance.

Conclusions
In this paper, we proposed a deep RNN-based short-term forecast algorithm for PV power generation. For the proposed short-term forecast, we collected on-site power and weather information using PIoT sensors installed in a PV power system, since accurate on-site weather information is significant for short-term prediction accuracy. We performed a correlation analysis between the generated PV power data and the meteorological data collected in the PV system to select the weather parameters. The correlation analysis confirmed that solar power was affected by solar irradiation, ambient temperature, and module temperature. Furthermore, through experimental investigations, we found that humidity and wind speed are significant meteorological features for changes in PV power. The proposed short-term forecast of PV power generation is based on a deep RNN model using PIoT data. It consists of one input layer, three hidden RNN layers, and one fully connected output layer. The sequence inputs are normalized, and the outputs are denormalized.
To evaluate the performance of the short-term PV power forecast, we performed various experiments varying the number of hidden layers, the sampling interval, and the number of time-steps of the deep RNN-based forecast model. Experimental results showed that the prediction accuracy was best when the model consisted of 3 hidden RNN layers with 12 time-steps in each RNN layer. In the experiments for the very short-term forecasting of PV power generation, the prediction accuracy was 99.1% (nMAE)/98.0% (nRMSE) for the 5 min ahead forecast and 98.6% (nMAE)/96.6% (nRMSE) for the 15 min ahead forecast. Their $R^2$-scores were 0.988 and 0.949, respectively. In the experiments for the short-term forecasting of PV power generation, the prediction accuracy was 94.8% (nRMSE) and 97.4% (nMAE) for the 1 h ahead forecast and 92.9% (nRMSE) and 96.2% (nMAE) for the 3 h ahead forecast. The $R^2$-scores of the 1 h and 3 h ahead predictions were 0.963 and 0.927, respectively. Experimental results showed that the proposed deep RNN-based short-term forecast algorithm achieved higher prediction accuracy than the ARIMA and SVR-RBF models.
To improve the current short-term PV power generation forecast model, we will develop new deep learning forecast models using other weather features, such as cloud images and dust sensor data, in the future. We will also study an abnormality detection method for PV power systems based on the short-term forecast algorithm. We expect that it will be useful for floating or marine photovoltaics.

Conflicts of Interest:
The authors declare no conflict of interest.
