Ultra-Short-Term Photovoltaic Power Prediction Model Based on the Localized Emotion Reconstruction Emotional Neural Network

Due to the intermittency and randomness of photovoltaic (PV) power, it is difficult to improve the prediction accuracy of traditional data-driven prediction models. A prediction model based on the localized emotion reconstruction emotional neural network (LERENN) is proposed, motivated by chaos theory and the neuropsychological theory of emotion. Firstly, a chaotic nonlinear dynamics approach is used to extract the hidden characteristics of the PV power time series, and the single-step cyclic rolling localized prediction mechanism is derived. Secondly, in order to establish the correlation between the prediction model and the specific characteristics of the PV power time series, the expanded signal and emotional parameters are reconstructed on a relatively certain local basis. Finally, the proposed prediction model is trained and tested for single-step and three-step prediction using actual measured data. Compared with prediction models based on the long short-term memory (LSTM) neural network, the limbic-based artificial emotional neural network (LiAENN), and the back propagation neural network (BPNN), as well as the persistence model (PM), the numerical results show that the proposed prediction model achieves better accuracy and better detection of ramp events under different weather conditions when using only PV power data.


Introduction
In response to reducing carbon emission caused by the fossil fuels and following the trend of global environmental protection, photovoltaic (PV) generation has been widely used as one of the environmentally friendly power generation alternatives. However, PV power shows high intermittency and randomness due to the impacts of various meteorological factors, which hinders the development of the grid-connected PV power system [1]. Ultra-short-term PV power prediction is considered for intra-hour prediction. The accurate prediction of PV power from a few seconds to one hour is important to assure grid quality and stability and can effectively help the grid to perform power smoothing [2]. Therefore, an effective and accurate prediction model for PV power is of great importance.
Physical methods and statistical methods can be used for ultra-short-term PV power prediction [3,4]. Physical methods are based on physical equations describing the laws of solar radiation and the operation of PV modules, as well as detailed data from numerical weather prediction (NWP) [5]. Among the physical methods, cloud image-based prediction can achieve high-precision ultra-short-term PV power prediction by monitoring cloud movement [6-8]. Physical methods do not require a large amount of historical data, but it is difficult for them to simulate some extreme weather [...] neural network. So far, emotional neural networks (ENNs) have not been applied to the field of PV power prediction. In this paper, an ultra-short-term PV power prediction model with localized emotion reconstruction in the LiAENN is proposed, which is combined with the idea of phase space reconstruction in chaotic time series analysis.
The contributions of this paper are as follows: (a) The chaos theory is first combined with neuropsychological theory of emotion to improve the LiAENN-based model; the proposed LERENN-based prediction model provides a new direction for ultra-short-term PV power prediction.
(b) By mining the hidden information of PV power time series and deriving the single-step cyclic rolling localized prediction mechanism, the influence of human subjective factors in the prediction process can be reduced.
(c) Reconstructing the expanded signal and emotional parameters according to the derived single-step cyclic rolling localized prediction mechanism makes the correlation between the prediction model and the characteristics of the PV power time series more accurate, which can further improve prediction accuracy.

The Neuropsychological Aspect of Emotion
Emotion plays an essential role in human cognition and perception, and emotional processing takes place in the limbic system. Figure 1 shows a schematic diagram of the interaction between the amygdala and the other brain systems in the limbic system. The limbic system commonly includes the amygdala, orbitofrontal cortex (OFC), thalamus, sensory cortex, hypothalamus and hippocampus. It can be clearly seen that the amygdala is highly connected with other limbic system components, such as the thalamus, sensory cortex and OFC. The amygdala is responsible for dealing with emotional stimuli, which arrive through two pathways: one is transmitted directly by the thalamus and is short but inaccurate; the other comes from the sensory cortex and is long but accurate. The important position of the amygdala in emotional processing indicates that it is the centerpiece of neuroeconomic decision-making. In the LiAENN shown in Figure 2, the amygdala and OFC modules are each expanded into two layers with two hidden neurons and a single output neuron by using biases b and an activation function f, in order to introduce the anxiety-confidence emotional states into the network [30]. The emotional stimuli $P_q$ ($q = 1, \ldots, n$) enter the thalamus as the input patterns and then go to the sensory cortex. The amygdala receives the input information from the sensory cortex. The thalamus also maps the expanded signal $P_{n+1}$, extracted from the input information, directly to the amygdala. The emotional output of the amygdala is Ea. The OFC produces the emotional output Eo, which inhibits the inaccurate emotional response of the amygdala and determines the final emotional output E. The dashed lines in Figures 1 and 2 represent the feedback effects of the resultant emotional response. In the LiAENN, the target value T of the input pattern, which controls the feedback effects, is employed to adjust the amygdala weights v and the OFC weights w.
Additionally, to control the effects of using targets, the network uses a decay rate to simulate the oblivious characteristic of amygdala.
The LiAENN is trained with the anxious confident decayed brain emotional learning rules (ACDBEL), wherein the added anxiety-confidence emotional state and their attentional effects are used in learning in the amygdala. The emotionally derived concepts in LiAENN provide a new direction for its application in the ultra-short-term prediction of PV power.

The Limitations of the Expanded Signal
The existence of the short path not only allows the model to react faster to a wide range of stimuli, but also provides another pathway for emotional learning if the long path is damaged. However, inappropriate expanded signals also bring greater challenges to prediction applications, such as interfering with other input information or introducing redundant information. The two-layer architecture of the LiAENN has two short paths, which increases the inaccuracy of the information transmitted through the short paths. The expanded signal, which is transmitted through the short path, is usually calculated from the input signals with an operator such as a mean operator or a max operator. A mean operator returns the average value of the input signals and is used to simulate their average trend. A max operator, which returns the maximum value of the input signals, is chosen to simulate the expanded signal in most ENNs. Although the applications of ENNs are becoming more and more mature, it is still not clearly defined whether the expanded signal should follow a uniform rule or be chosen for a particular application. However, a precise definition of the expanded signal is critical to the accuracy of the prediction. In this paper, the expanded signal is defined according to the chaotic characteristics of the PV power time series and the prediction mechanism.

The Limitations of the Emotional Parameters
The LiAENN is distinctive because more emotional concepts are involved in the emotional computing. The added emotional parameters referring to the anxiety and confidence more closely mimic the attentional behavior of human learning. The confidence and anxiety variables are influenced by the perceived objects. Emotional psychology theory holds that the new learning task will bring a high initial anxiety level and low confidence level. Rather, the proficiency of practice will lead to a lower level of anxiety and a higher level of confidence. Confidence makes the previous update occupy a dominant position. Anxiety has an effect on enhancement, including latest errors, which effectively slows down the learning of new tasks. Hence, anxiety can be seen as a feature of attention focusing on learning about new and "interesting" data. The choice of these data should be considered in terms of the interaction mechanism between emotion and attention. The amygdala is responsible for attentional behavior, which can eliminate interference items from desired target objects to obtain the salience. The interaction mechanism between emotion and attention can be summarized as follows: the attention is the first step in emotion processing, and on the contrary, emotional function helps to guide attention to a great extent. The important role of the amygdala in attention and memory suggests that the quick low-level automatic emotional responses are derived from the most important stimuli associated with survival [31]. Hence, these new and "interesting" data should be determined according to the specific application. On the facial detection and emotion recognition experiments, the new and "interesting" data are resourced from the average value of the global input signals, whose goal is to mimic trends in human emotional judgments and preferences based on general impressions, rather than precise details of perceived objects.
The fluctuation of wind power is mainly reflected in the hourly fluctuation, whereas the PV power has stronger fluctuation in a few minutes. Therefore, for the ultra-short-term PV power prediction, the global average of the input signals does not provide the most direct stimulus to the emotional learning of the network. It is especially important to pick out the most important information about the prediction from the input signals. The improvement of LiAENN concentrates on tracking the detailed information of input signals, which is crucial for ultra-short-term PV power prediction. The construction of emotion parameters has a relatively certain local basis based on the chaos theory.

Chaotic Time Series Analysis
If the behavior of the observed time series data is chaotic, it can be assumed that the behavior follows a certain deterministic law in the high-dimensional phase space. Considering the chaotic characteristics of PV power helps to better explore the relationship between emotion and attention.
The key point of this approach lies in the phase space reconstruction of the dynamics, which aims at mapping the historical time series into a high-dimensional phase space and then extracting and restoring the original law. The original law is a kind of trajectory in high-dimensional space, which is called a chaotic attractor [32]. For a chaotic system, the phase space is defined as a vector space $R^m$, where each point is represented by an m-dimensional vector r(t), which is expressed as:
$$r(t) = [r_1(t), r_2(t), \ldots, r_m(t)]$$
where t is the index of the time series and m is the dimension of the vector space. According to Takens' embedding theorem, the value of r(t) and its related components $r_1(t), r_2(t), \ldots, r_m(t)$ are unknown in the chaotic system. However, the evolution of any component of the system can be determined by the other components interacting with it, so the information of these related components is implicit in the development of any single component. This means that if a single quantity or variable x(t) can be observed from a chaotic system, the chaotic attractor can be recovered from the reconstructed dynamics of a system $X(t) = [x(t), x(t + \tau), x(t + 2\tau), \ldots]$ with a certain time delay τ, which is geometrically similar to the original attractor. Therefore, the reconstructed phase space $X(t) \rightarrow X(t + \tau)$ can be used to reflect the unknown dynamics of the actual system $r(t) \rightarrow r(t + \tau)$ [33]. The future value of the system at time t + τ can be determined by the following equation with the nonlinear function $f: R^m \rightarrow R^m$, which describes the system:
$$r(t + \tau) = f(r(t))$$
where the arrows appearing in the text represent a mapping from one-dimensional space to multi-dimensional space. Thus, although the PV power time series is random, its deterministic behavior can be described in the embedded phase space. The first step in reconstructing the PV power chaotic time series into phase space points is to determine the embedding dimension m and delay time τ based on the embedding theorem.
Due to the small amount of calculation and strong anti-noise performance, the C-C method is used to calculate the phase space reconstruction parameters [23]. It is a kind of time delay window technique based on time series. The delay time τ is obtained by multiplying the delay amount l and the sampling time ∆t. Taking into account the discreteness of the sampling data, we use the delay amount l instead of delay time τ.
First, the correlation integral of the PV power time series x(i) (i = 1, 2, ..., N) is given as follows:
$$C(m, N, r_d, l) = \frac{2}{M(M-1)} \sum_{1 \le e < f \le M} H\left(r_d - \lVert X_e - X_f \rVert_\infty\right), \quad M = N - (m-1)l$$
where N is the length of the time series; M is the number of delay vectors; $r_d$ ($r_d > 0$) is defined as the spatial distance; and H(a) is a step function, i.e., H(a) = 0 if a < 0; 1 otherwise. $X_e$ and $X_f$ are random point vectors of the PV power output time series in the reconstructed phase space. The infinity norm is used to measure the distance between $X_e$ and $X_f$. The BDS (Brock-Dechert-Scheinkman) statistic is applied to obtain an appropriate estimation of m and $r_d$. Choose: where d ∈ (1, 2, 3, 4) and σ is the standard deviation of the time series. Second, the PV power output test statistics are computed. Considering the limited sequence length and the possible relationship among the time series data points, we divide the PV power output time series x(i) into l sequences with a length N/l. We define CC = 1 as an intermediate variable. The system test statistics S(l) can be found when N is large enough (or approaches infinity in theory): The first zero crossing of S(l) is selected as the optimal delay amount $l_{opt}$ of the PV power output time series phase space reconstruction. Take the difference between the maximum value and the minimum value of S(l) over $r_d$ for the same m and l. ∆S(l) is defined as the average value of this difference over the different dimensions m, i.e., On account of the finite length and noise effect of the time series data, S(l) may not reach a zero-crossing point. Then, the first local minimum value of ∆S(l) can be chosen to determine the optimal delay amount $l_{opt}$ in the time series phase space reconstruction. A new statistic $S_{cor}(l)$ is then defined as: Determine the global minimum value of $S_{cor}(l)$, which corresponds to the optimal estimate $l^*$ of the average trajectory cycle. We then have the best dimension $m_{opt}$.
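The correlation integral at the heart of the C-C method can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the forward delay-vector layout follows the X(t) definition given earlier, and the radii $r_d = d\sigma/2$ in the usage example are an assumption taken from the commonly used form of the C-C method rather than from this text.

```python
import numpy as np

def delay_vectors(x, m, l):
    """Stack delay vectors [x(i), x(i+l), ..., x(i+(m-1)l)] as rows."""
    x = np.asarray(x, dtype=float)
    M = len(x) - (m - 1) * l              # number of delay vectors
    if M < 2:
        raise ValueError("series too short for this (m, l)")
    return np.stack([x[k * l : k * l + M] for k in range(m)], axis=1)

def correlation_integral(x, m, l, r_d):
    """Fraction of delay-vector pairs whose infinity-norm distance is <= r_d."""
    X = delay_vectors(x, m, l)
    M = X.shape[0]
    # Pairwise infinity-norm distances between all delay vectors.
    dist = np.max(np.abs(X[:, None, :] - X[None, :, :]), axis=2)
    e, f = np.triu_indices(M, k=1)        # count each unordered pair once
    # Step function H(a) = 1 for a >= 0 and 0 otherwise, as stated in the text.
    return float(np.mean(r_d - dist[e, f] >= 0))

if __name__ == "__main__":
    series = np.sin(0.3 * np.arange(200))
    sigma = series.std()
    for d in (1, 2, 3, 4):                # assumed radii r_d = d * sigma / 2
        print(d, correlation_integral(series, m=3, l=2, r_d=d * sigma / 2))
```

The pairwise distance array grows with the square of the number of delay vectors, so for long series the pairs would need to be processed in chunks.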
Then, the PV power time series x(i) can be embedded in m-dimensional space by plotting the delay vector X:
$$X_i = [x(i), x(i + l), \ldots, x(i + (m-1)l)], \quad i = 1, 2, \ldots, M$$

The Single-Step Cyclic Rolling Localized Prediction Mechanism
The single-step cyclic rolling localized prediction mechanism [34] is described as follows: $x(t_1)$ is assumed to be the first power value to be predicted, and $x(t_0)$ is the known quantity. When the power $x(t_1)$ is to be predicted at time $t_1$, the correspondence between $X(t_0)$ and $x(t_1)$ is as follows:
$$X(t_0) \rightarrow x(t_1)$$
where the arrow represents the corresponding relationship between input and output when the model predicts the power value $x(t_1)$ at time $t_1$. Then, the delay vector $X(t_0)$ is imported into the trained model, and the single-step prediction is performed. Hence, the predicted power $x(t_1)_{pre}$ at time $t_1$ is obtained.
When the power at time $t_2$ is to be predicted, the actual PV power $x(t_1)_{real}$ at time $t_1$ is already available, so $x(t_1)_{real}$ can be added to the last position of the phase space vector $X(t_1)$ ($x(t_1) = x(t_1)_{real}$). Based on the phase space reconstruction, a new chaotic phase space based on $x(t_1)$ is constructed as follows:
$$X(t_1) \rightarrow x(t_2)$$
Then, the delay vector $X(t_1)$ is imported into the trained model, and the power $x(t_2)$ at time $t_2$ can be predicted. In this way the prediction rolls forward one step, and the cycle is repeated to realize the ultra-short-term prediction of every time step in the coming day. The pattern-target samples extracted from the PV power chaotic time series are shown in Table 1 and the whole prediction mechanism is shown in Figure 3.
It is worth noting that each update forms a new set of chaotic phase space points, and the only unknown value in the actual prediction process refers to the last phase space point of each delay vector, which can be defined as prediction center point. The mapping relationship is more precise. This prediction mechanism ensures that the model can be adjusted by pattern-target samples and the next predicted value is not affected by the previous predicted value, which avoids the problem of error accumulation in rolling prediction.
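The mechanism above can be illustrated with a short sketch that builds pattern-target samples and then rolls forward one step at a time, always appending the measured value and never the predicted one. The function names and the `model` callable are hypothetical placeholders, and integer sample indices stand in for the times $t_j$; the trained network would take the place of `model`.

```python
import numpy as np

def pattern_target_samples(x, m, l):
    """Pair each delay vector ending at x[j] with the next value x[j + 1]."""
    x = np.asarray(x, dtype=float)
    span = (m - 1) * l                        # history covered by one pattern
    ends = np.arange(span, len(x) - 1)        # index of the last pattern component
    offsets = np.arange(-span, 1, l)          # -(m-1)l, ..., -l, 0
    patterns = x[ends[:, None] + offsets[None, :]]
    targets = x[ends + 1]
    return patterns, targets

def rolling_single_step(model, history, actual, m, l):
    """Predict each value of `actual` one step ahead using measured data only."""
    buf = [float(v) for v in history]
    span = (m - 1) * l
    if len(buf) < span + 1:
        raise ValueError("history is shorter than one delay vector")
    preds = []
    for measured in actual:
        pattern = np.array(buf[-1 - span :: l])   # newest sample is the last component
        preds.append(float(model(pattern)))
        buf.append(float(measured))           # roll forward with the real value
    return np.array(preds)

if __name__ == "__main__":
    x = np.arange(10.0)
    P, T = pattern_target_samples(x, m=3, l=2)
    print(P[0], T[0])                         # first pattern and its target
    # A stand-in model that simply repeats the last pattern component.
    print(rolling_single_step(lambda p: p[-1], x[:5], x[5:8], m=3, l=2))
```

Because each new pattern is rebuilt from measured data, an error in one prediction does not propagate into the next input.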

Table 1. Pattern-target samples extracted from the PV power chaotic time series (columns: Time, Pattern, Target).

A. Expanded Signal
Based on the aforementioned analysis, the actual prediction process can be described as a localized rolling prediction between a single delay vector with m input components and one output component. The mapping relationship shown in Table 1 indicates that the prediction center point is critical to the prediction value, so it can be chosen as the expanded signal. The expanded signal is expressed as follows:
$$P_{n+1} = x(t_j)$$
where $x(t_j)$ is the prediction center point, i.e., the last component of the delay vector $X(t_j)$.
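To make the difference between the candidate expanded signals concrete, the sketch below compares the mean operator, the max operator, and the prediction center point on one pattern. The function name, the `mode` argument, and the sample values are illustrative assumptions.

```python
import numpy as np

def expanded_signal(pattern, mode="center"):
    """Return the expanded signal of one input pattern.

    mode="mean"   : average of the inputs (global impression)
    mode="max"    : maximum of the inputs
    mode="center" : last component of the delay vector (prediction center point)
    """
    pattern = np.asarray(pattern, dtype=float)
    if mode == "mean":
        return float(pattern.mean())
    if mode == "max":
        return float(pattern.max())
    if mode == "center":
        return float(pattern[-1])
    raise ValueError(f"unknown mode: {mode}")

if __name__ == "__main__":
    # A normalized pattern recorded while power is dropping sharply.
    falling = [0.8, 0.7, 0.3, 0.2, 0.1]
    for mode in ("mean", "max", "center"):
        print(mode, expanded_signal(falling, mode))
```

On a steep ramp like this one, the mean and max are dominated by older samples, while the center point carries the most recent level.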

B. Emotional Parameters
The motivation for modifying these emotional parameters is our human cognitive process of new learning tasks. For the ultra-short-term prediction of PV power, the delay vector X is a time-dependent sequence transformed from the initial observation x(i) after stretching and folding. According to the prediction mechanism and the embedding theorem, the network tracks one pattern at a time. The anxiety level is affected by each pattern-target sample which is exposed to the network, and the effect of each component of single pattern on anxiety level increases with time. The predicted value is determined by x(t j ) to a large extent. Based on the above analysis, the emotional parameters can be successfully modeled within the network configuration by paying attention to the details of each pattern-target sample instead of the general impression.
The anxiety coefficient and confidence coefficient can be expressed as µ and k. The initial values of the anxiety coefficient and the confidence coefficient are set to "1" and "0", respectively, which means that a new learning task such as first iteration needs more attention to be devoted to the learning of prediction model. With the deepening of learning or the increase of iteration steps, the decrease of anxiety level means that the derivative of the error of the training patterns is less and less valued by the network. On the contrary, increasing attention has been attached to the previous changes of network that confidence level made to weights. Therefore, the minimization of the error brings about a high level of confidence and a low level of anxiety. The anxiety and confidence maintain a balance of attention between previous iteration and subsequent iteration.
The anxiety coefficient µ(t j ) at each time can be expressed as follows: The err feedback of each pattern-target sample at each time is defined as: Then the final anxiety coefficient at the ς-th iteration can be calculated as: The value of confidence coefficient at the ς-th iteration is defined as: where µ 0 is the value of anxiety coefficient at the first iteration.
After the localized emotion reconstruction, the prediction model is trained to capture the functional relationship among given phase space points. Finally, the weights and biases of the trained model are maintained to predict the future values of the phase space points. The future values of time series are obtained when the unknown phase space points are predicted. Only the amygdala involves the emotional states, and the above process is presented in Figure 4.

Feed Forward Computations
For the amygdala, the detailed steps are as follows:
i. Input Layer to Hidden Layer
In Figure 4, the delay vector $X(t_j)$ ($j = 0, 1, \ldots, M-1$), as the n inputs $P_q$ ($q = 1, \ldots, n$), enters the amygdala from the sensory cortex; meanwhile, the expanded signal $P_{n+1}$ enters the amygdala as another input. For the amygdala, $h_i$ is the i-th (i = 1, 2) neuron in the hidden layer. $ba^1_i$ is the i-th bias neuron in the hidden layer, which is set to "+1." $Ea_{h_i}$ is the weighted sum of the inputs to the i-th neuron in the hidden layer, which can be expressed as in Equation (17). $f^1_a$ is the activation function of the hidden layer. $Ea_i$ is the activated value of the i-th neuron as the final output of the hidden layer, which can be expressed as in Equation (18). $v^1_{q.i}$ is the amygdala weight associated with the connection between the q-th neuron in the input layer and the i-th neuron in the hidden layer.
ii. Hidden Layer to Output Layer
Similarly, the output value of the output layer in the amygdala is calculated as follows: where $v^2_{i.1}$ (i = 1, 2) is the amygdala weight in the output layer located between the i-th neuron in the hidden layer and the output neuron; $ba^2_1$ is related to the bias neuron in the output layer; $f^2_a$ is the activation function of the output layer.
In the same way, the output Eo of OFC can be obtained, and the final output can be calculated by the following equation:
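The feed forward pass through the two modules can be sketched as below. Several details are assumptions made for illustration and are not confirmed by this text: the tanh hidden activation, the linear output activation, the OFC receiving only the n pattern inputs, the bias being folded into a single bias weight, and the final output being taken as E = Ea - Eo (the usual form in brain emotional learning models). All function and variable names are hypothetical.

```python
import numpy as np

def two_layer(inputs, w_hidden, b_hidden, w_out, b_out):
    """One module: two hidden neurons feeding a single output neuron."""
    hidden_sum = inputs @ w_hidden + b_hidden     # weighted sums of the hidden neurons
    hidden_act = np.tanh(hidden_sum)              # assumed hidden activation
    return float(hidden_act @ w_out + b_out)      # assumed linear output activation

def feed_forward(pattern, expanded_signal, amygdala, ofc):
    """Return (E, Ea, Eo) for one input pattern."""
    p = np.asarray(pattern, dtype=float)
    amy_in = np.append(p, expanded_signal)        # P_1..P_n plus P_{n+1}
    Ea = two_layer(amy_in, *amygdala)
    Eo = two_layer(p, *ofc)                       # assumed: OFC sees only P_1..P_n
    return Ea - Eo, Ea, Eo                        # assumed: E = Ea - Eo

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 5
    # Each module is (hidden weights, hidden biases, output weights, output bias).
    amygdala = (rng.normal(size=(n + 1, 2)), np.zeros(2), rng.normal(size=2), 0.0)
    ofc = (rng.normal(size=(n, 2)), np.zeros(2), rng.normal(size=2), 0.0)
    pattern = rng.uniform(size=n)
    print(feed_forward(pattern, pattern[-1], amygdala, ofc))
```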

Backward Learning Computations
The backward learning computations are aimed at updating the learning weights of the amygdala and the OFC, which is similar to the error back propagation algorithm.
As can be seen from Figure 2, the output error of the amygdala is err, as shown in Equation (22):
$$err = T - Ea$$
where T is the target value and Ea is the output of the amygdala obtained from the feed forward computations.
The aim of the training process is to minimize this error over the training patterns. For the output layer neuron, a quantity called the error signal is represented by $\Delta J_a$, which is expressed as: For the first hidden neuron, the error signal is defined as follows: Then, the learning weights of the first hidden neuron are calculated by the following equation: Particularly, due to the expanded signal from the thalamus, where µ and k are updated based on Equations (13)-(16) at each iteration, η is the learning coefficient, and γ is the decay rate in the amygdala learning rule. The weight $v^2_{1.1}$ and the bias $ba^2_1$ are adjusted as follows: The updating between the second hidden neuron and the output neuron is similar, as are the backward learning computations of the OFC, so the details are not given here.

The Ultra-Short-Term PV Power Prediction Framework Based on the Localized Emotion Reconstruction Emotional Neural Network
After collecting the PV power time series data, the prediction can be implemented with the following steps: (a) After data normalization, the phase space reconstruction of the obtained PV power time series is performed.
(b) Construct the localized emotion reconstruction emotional neural network (LERENN)-based model; the overall frame structure of the prediction model, especially the number of input nodes and output nodes, is determined based on the data matrix of phase space point vector. Additionally, the initial values of the emotional parameters are set.
(c) Import the phase space points to the model; the proposed model is trained with the pattern-target pairs to capture the functional relationships among the given phase space points. The total training process includes the feed forward computations, emotional parameter settings, and backward learning computations. Among them, the setting of emotional parameters is carried out in accordance with Section 2.3.3.
(d) The weights and biases of the trained model are maintained to predict the future values of the phase space points.
(e) Repeat the above steps to perform prediction. The corresponding prediction process of PV power based on the proposed model is shown in Figure 5.
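Step (a) does not specify the normalization scheme. The sketch below assumes min-max scaling to [0, 1] with the bounds taken from the training data only, which is a common choice but not confirmed here; the function names are hypothetical.

```python
import numpy as np

def fit_minmax(train):
    """Return the (low, high) bounds of the training series."""
    train = np.asarray(train, dtype=float)
    low, high = float(train.min()), float(train.max())
    if high == low:
        raise ValueError("constant series cannot be min-max scaled")
    return low, high

def normalize(x, low, high):
    """Scale power values with the training bounds."""
    return (np.asarray(x, dtype=float) - low) / (high - low)

def denormalize(z, low, high):
    """Map normalized model outputs back to power units."""
    return np.asarray(z, dtype=float) * (high - low) + low

if __name__ == "__main__":
    low, high = fit_minmax([0.0, 50.0, 200.0])
    z = normalize([100.0, 150.0], low, high)
    print(z, denormalize(z, low, high))
```

Fitting the bounds on the training days only keeps information from the forecasting days out of the model inputs.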

Description of Dataset
The grid-connected PV power station built by the National Institute of Standards and Technology (NIST) on its Gaithersburg, MD campus provides high-resolution, low-uncertainty, comprehensive PV output power data for extended, continuous time periods. There is a single inverter at the station, which is connected to the local grid via the NIST campus grid [35]. In this paper, the data of 70 days in the third quarter of 2015 were selected for simulation. Sampling was done daily from 6:00 am to 7:00 pm every 5 min, so each daily set contains 157 sampling points. In order to obtain appropriate prediction accuracy with an affordable computational burden, historical data of 62 days were used as the training dataset, and 8 days of data under different weather conditions were chosen as the forecasting dataset. The training dataset covers different weather conditions, and the entire dataset contains only historical PV power data. To reflect the prediction performance of the proposed model, the selected forecasting dataset includes 2 sunny days, 2 cloudy days, 2 overcast days, and 2 abrupt weather days, such as sunny to cloudy and cloudy to sunny [36]. For this dataset, the ultra-short-term PV power prediction was carried out with a step length of 5 min.
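The sample counts above follow directly from the sampling window and can be checked with a short sketch. The function name is hypothetical, and both endpoints of the window are assumed to be sampled.

```python
from datetime import datetime, timedelta

def daily_sample_times(start="06:00", end="19:00", step_min=5):
    """List the sampling times of one day, both endpoints included."""
    t = datetime.strptime(start, "%H:%M")
    stop = datetime.strptime(end, "%H:%M")
    times = []
    while t <= stop:
        times.append(t.strftime("%H:%M"))
        t += timedelta(minutes=step_min)
    return times

if __name__ == "__main__":
    per_day = len(daily_sample_times())
    print(per_day)            # points per day
    print(62 * per_day)       # training samples over 62 days
    print(8 * per_day)        # forecasting samples over 8 days
```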
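To quantify errors, the mean absolute percentage error (MAPE) and the root-mean-squared error (RMSE) were used as the two main metrics. In particular, MAPE and RMSE are defined as follows:
$$\mathrm{MAPE} = \frac{1}{N} \sum_{s=1}^{N} \left| \frac{P_p^s - P_a^s}{P_a^s} \right| \times 100\%$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{s=1}^{N} \left( P_p^s - P_a^s \right)^2}$$
where $P_p^s$ and $P_a^s$ are the s-th values in the predicted time series and the actual series of measured PV power, respectively, and N denotes the number of samples in the test set.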
However, since in some extreme weather conditions or at certain points in time, the actual PV power may fall to zero, the sum of squares due to error (SSE) defined by Equation (32) was used to represent the error in PV power prediction.
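The three point-wise metrics can be sketched as follows, using their standard definitions. The function names are hypothetical; the MAPE helper raises an error when an actual value is zero, which is the situation that motivates using SSE alongside it.

```python
import numpy as np

def mape(pred, actual):
    """Mean absolute percentage error in percent; undefined for zero actual values."""
    pred = np.asarray(pred, dtype=float)
    actual = np.asarray(actual, dtype=float)
    if np.any(actual == 0):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return float(100.0 * np.mean(np.abs((pred - actual) / actual)))

def rmse(pred, actual):
    """Root-mean-squared error in the units of the series."""
    diff = np.asarray(pred, dtype=float) - np.asarray(actual, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def sse(pred, actual):
    """Sum of squared errors; remains defined when the actual power is zero."""
    diff = np.asarray(pred, dtype=float) - np.asarray(actual, dtype=float)
    return float(np.sum(diff ** 2))

if __name__ == "__main__":
    predicted = [110.0, 90.0]
    measured = [100.0, 100.0]
    print(mape(predicted, measured), rmse(predicted, measured), sse(predicted, measured))
```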
The above three evaluation metrics give point-wise error information; however, they are not sufficient to distinguish the prediction behavior of different prediction methods. In the variability of PV power, repercussions from large ramping events are of primary concern. Hence, it is useful to use a ramp metric to quantify the ability of prediction methods to capture ramp events. In this paper, we use the Ramp score proposed by Vallance et al. [37] as another metric. The Ramp metric is defined as follows:
$$\mathrm{Ramp\ score} = \frac{1}{t_{max} - t_{min}} \int_{t_{min}}^{t_{max}} \left| SD(T(t)) - SD(R(t)) \right| dt$$
where SD(T(t)) and SD(R(t)) are the slopes of the test series and real series ramps, respectively, and $t_{max}$ and $t_{min}$ are the bounds of the period to be predicted.
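A simplified version of a ramp metric is sketched below, taking the score as the time-averaged absolute difference between the slopes of the two series. The published metric first compresses each series into piecewise-linear ramps before comparing slopes; that step is skipped here and raw finite-difference slopes are used instead, so the numbers are not comparable with published Ramp scores. The function name and the `dt` argument are hypothetical.

```python
import numpy as np

def ramp_score_simplified(test, real, dt=5.0):
    """Time-averaged absolute slope difference between two equally sampled series.

    dt is the sampling interval (5 min in this dataset). No ramp extraction
    is performed, which is a simplification of the published metric.
    """
    test = np.asarray(test, dtype=float)
    real = np.asarray(real, dtype=float)
    if test.shape != real.shape or test.size < 2:
        raise ValueError("series must have equal length of at least 2")
    slope_test = np.diff(test) / dt
    slope_real = np.diff(real) / dt
    duration = dt * (test.size - 1)               # t_max - t_min
    # Rectangle-rule integral of |slope difference| divided by the duration.
    return float(np.sum(np.abs(slope_test - slope_real)) * dt / duration)

if __name__ == "__main__":
    measured = [0.0, 0.0, 10.0]
    lagging = [0.0, 5.0, 10.0]
    print(ramp_score_simplified(lagging, measured))
    print(ramp_score_simplified(measured, measured))
```

A forecast that reaches the right level but smooths out a sharp ramp gets a non-zero score even if its point-wise error is small.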

Benchmark Models for Numerical Comparison
For comparison, the proposed model was compared with a persistence model (PM) [38] commonly used as a benchmark model for ultra-short-term PV power prediction. In addition, the performance of the LSTM-based model, LiAENN-based model and the BPNN-based model were also compared to the proposed model.
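The persistence benchmark is simple enough to state in code: the forecast for a future step is the most recent measured value. The sketch below is a generic version with a hypothetical function name; whether the referenced persistence model applies any clear-sky correction is not stated here, so none is assumed.

```python
import numpy as np

def persistence_forecast(series, horizon=1):
    """Return (predictions, aligned actual values) for a persistence model.

    The prediction for index i + horizon is the observation at index i.
    """
    s = np.asarray(series, dtype=float)
    if horizon < 1 or horizon >= s.size:
        raise ValueError("horizon must be between 1 and len(series) - 1")
    return s[:-horizon], s[horizon:]

if __name__ == "__main__":
    power = [1.0, 2.0, 4.0, 8.0]
    print(persistence_forecast(power, horizon=1))   # single-step
    print(persistence_forecast(power, horizon=3))   # three-step
```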
It is noteworthy to mention that for a fair comparison, the setting of key parameters was tested in the search of optimum values. For the proposed model, the statistic curve obtained with the C-C method is shown in Figure 6.
As can be seen from Figure 6, since S(l) has no zero crossing, the first local minimum value of ∆S(l) is chosen to determine the optimal delay amount $l_{opt}$ in the time series phase space reconstruction. The global minimum value of $S_{cor}(l)$ is then determined, which corresponds to the optimal estimate $l^*$ of the average trajectory cycle. From the figure, we have $l_{opt}$ = 12 and $l^*$ = 36. We then calculate the optimal dimension $m_{opt}$ = 5 via Equation (8). For the decay rate γ, due to the sensitivity of the PV power chaotic system to the initial value, γ should be set to a relatively small value. Values were first taken at intervals of 0.05 within 0 to 1, and each training was repeated 10 times. γ = 0 is unreliable, meaning that the model barely learns new pattern-target samples during each training iteration. As γ increases, the error jump range increases, and finally the performance of the model tends to be unstable. When γ = 0.55, the training fails. The range of γ was then constrained to 0 to 0.05 with a step size of 0.01. Finally, the value 0.01 was obtained as the optimum decay rate. For the BPNN-based model, we choose a BPNN with a three-layer network structure; the logsig function is used as the neuron transfer function of the hidden layer, and the purelin function is used as the neuron transfer function of the output layer. The weights and thresholds of the network are initialized by the rand function. The number of neurons in the hidden layer is determined by trial according to the empirical formula [39], namely:
$$l = \sqrt{G + H} + a$$
where G, l, and H are the numbers of neurons in the input layer, the hidden layer, and the output layer, respectively, and a is a constant between 0 and 10.
As can be seen from Table 2, the best architecture of the BPNN-based model for PV power prediction is 5-11-1 (5 inputs, 11 hidden neurons, and 1 output). Table 3 lists the final parameters of the successfully trained models, including the BPNN-based model, the LiAENN-based model, and the proposed model.
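The two parameter searches described above can be laid out as explicit candidate lists. The hidden-layer rule is assumed here to be the widely used empirical formula l = sqrt(G + H) + a with a between 0 and 10; that form, the rounding to whole neurons, and the function names are assumptions for illustration.

```python
import math

def hidden_neuron_candidates(n_in, n_out, a_min=0, a_max=10):
    """Integer hidden-layer sizes covered by sqrt(n_in + n_out) + a, a in [a_min, a_max]."""
    base = math.sqrt(n_in + n_out)
    low = math.ceil(base + a_min)
    high = math.floor(base + a_max)
    return list(range(low, high + 1))

def decay_rate_grids():
    """Coarse grid (0 to 1, step 0.05) and fine grid (0 to 0.05, step 0.01) for the decay rate."""
    coarse = [round(0.05 * k, 2) for k in range(21)]
    fine = [round(0.01 * k, 2) for k in range(6)]
    return coarse, fine

if __name__ == "__main__":
    print(hidden_neuron_candidates(5, 1))   # trial sizes for 5 inputs and 1 output
    coarse, fine = decay_rate_grids()
    print(coarse)
    print(fine)
```

Under these assumptions, the reported 5-11-1 architecture falls inside the candidate range for 5 inputs and 1 output.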

Numerical Results and Analysis
The simulations were carried out to test the performance of the proposed model and compare it with the benchmark models. Training and testing of the prediction models were implemented in MATLAB. For a fairer comparison, each model was run 30 times independently. Figure 7 shows the prediction results of PV power under five typical weather conditions. It is clear from Figure 7b,c that the five prediction models coincide well with the actual values in sunny weather. It can be seen from Figure 7b that the actual power curve is not completely smooth, so the prediction curves of the models deviate to different degrees throughout the prediction interval. Between 6:00 am and 7:00 am and between 3:00 pm and 7:00 pm, the prediction results of the LiAENN-based model and the BPNN-based model both show significant deviations, with the BPNN-based model deviating the most. The prediction results of the PM and the proposed model are relatively close. Overall, the prediction curve of the proposed model is closer to the actual curve. However, the prediction error of the LSTM-based model, the PM and the proposed model is mainly concentrated in the stages of steep rise and fall of power. In order to further compare the prediction performance of these three prediction models, the prediction curve of the stage with large power fluctuation between 11:00 and 12:00 was enlarged. From the partial enlarged drawing, it can be seen that each prediction model has a certain lag when tracking the PV output. During the power climbing phase, the predicted value is generally lower than the actual value, and during the power decline phase, the predicted value is generally higher than the actual value. The strong inertia effect of the PM over a short period of time makes the dislocation between its predicted curve and the actual curve the most obvious. Compared with the proposed model, the prediction error of the PM increases significantly.
Compared with the LSTM-based model, the proposed model predicts better at the power inflection points and can therefore detect ramp events better. The PV output power curve in Figure 7c is smoother than that of the first sunny day. The large prediction deviation of the benchmark models appears near the peak value. Taking the two sunny test days together, the proposed model outperforms all of the benchmark models in sunny weather.
In abrupt weather, the clouds change suddenly, and the PV power suddenly rises or falls with large fluctuation. The prediction results of each model fluctuate to a large extent. In the power smoothing phase, each model coincides well with the actual value. In the stages of large power fluctuations, as shown from 11:00 am to 4:00 pm in Figure 7a and from 9:00 am to 11:00 am in Figure 7f, both the LiAENN-based model and the BPNN-based model have large prediction errors. From the partial enlarged drawings, it can be seen that when the power rises and falls sharply, the prediction curve of the LSTM-based model is smoother than that of the proposed model, and its ability to detect ramp events is poor. Although the prediction results of the PM can reflect the overall trend of PV power, due to the inertia effect of the PM, its tracking is obviously inferior to that of the proposed model when the PV power sharply rises and falls, especially at the inflection points. The proposed model can still track the original power curve well, although its prediction curve has some fluctuations. This shows that reconstructing the chaotic phase space to extract the original PV power information, and reconstructing the expanded signal and emotional parameters, makes the model more sensitive to abrupt changes and fluctuations of PV power.
On cloudy days, affected by the random behavior of the clouds, the power fluctuates greatly while the PV output is large, and every model performs worst in this weather. Figure 7d,e and the enlarged views show that the proposed model still outperforms all of the benchmark models, and the BPNN-based model performs the worst. The proposed model thus suppresses large prediction errors, especially when the PV power fluctuates sharply.
On overcast days, the PV output is low, and the power fluctuation is relatively small because the cloud cover is relatively uniform. Figure 7g,h clearly shows that the predicted values of the models are generally lower than the actual values in the power climbing stage and generally higher in the power decline stage. The prediction deviation appears mainly near the peaks and valleys; the BPNN-based model is the worst, followed by the LiAENN-based model. In the enlarged views, the prediction curves of the proposed model, the LSTM-based model and the PM are close to each other. The curve of the proposed model is closer to the actual curve, which means that the proposed model improves the prediction accuracy of PV output on overcast days, although the gain is limited and there is still room for improvement.
To compare the proposed model and the benchmark models more closely, the prediction errors of the different models under different weather conditions are summarized in Table 4. As Table 4 shows, the models differ least in sunny weather. In general, the proposed model outperforms all of the benchmark models under the different weather conditions, except for individual metrics that are slightly higher than those of the LSTM-based model and the PM; these are shown in bold in the table. Focusing on the average of four metrics under various weather conditions in Table 5
Note: The definitions of abbreviations in Table 5 are the same as those in Table 4.
For comparison, Figure 8 depicts the distributions of the relative error of the proposed model and the benchmark models over an 8-day period. The relative error percentage is divided into 10 bins, and the reduction in prediction error is highlighted in the figure. The largest share of the error reduction achieved by the proposed model lies in the first bin, where it improves on the LSTM-based model, the PM, the LiAENN-based model and the BPNN-based model by 9.24%, 5.34%, 14.89% and 20.39%, respectively. This result validates the effectiveness of the LERENN-based model in reducing large prediction errors.
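A binned comparison of this kind can be sketched as follows. The normalization by a rated `capacity`, the equal bin width of 10 percentage points, and all sample values are assumptions made for illustration; the text does not state the bin edges or the exact definition of the relative error.

```python
def relative_error_percent(actual, predicted, capacity):
    """Absolute error as a percentage of an assumed rated capacity."""
    return [abs(p - a) / capacity * 100.0 for a, p in zip(actual, predicted)]


def bin_shares(errors_pct, n_bins=10, bin_width=10.0):
    """Percentage of samples falling in each equal-width error bin.

    Errors beyond the last edge are counted in the last bin.
    """
    if not errors_pct:
        raise ValueError("no errors to bin")
    counts = [0] * n_bins
    for e in errors_pct:
        counts[min(int(e // bin_width), n_bins - 1)] += 1
    return [100.0 * c / len(errors_pct) for c in counts]


# Hypothetical data (kW), not taken from the paper.
actual = [50.0, 60.0, 70.0, 80.0]
model_a = [52.0, 61.0, 85.0, 79.0]
model_b = [62.0, 75.0, 85.0, 79.0]

shares_a = bin_shares(relative_error_percent(actual, model_a, capacity=100.0))
shares_b = bin_shares(relative_error_percent(actual, model_b, capacity=100.0))

# Gain of model A over model B in the first (smallest-error) bin, in percentage points.
first_bin_gain = shares_a[0] - shares_b[0]
```

A larger share in the first bin means that more predictions fall in the smallest-error range, which is the sense in which the first-bin improvements above are reported.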
At present, the minimum time resolution of power dispatching is 15 min. To verify the prediction performance more comprehensively, three-step prediction was carried out for the first, second, fourth, sixth and seventh test days.
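One plausible reading of multi-step prediction with a single-step rolling mechanism is the recursive scheme sketched below, in which each prediction is fed back into the input window to produce the next one. The function names, the window length, and the simple trend extrapolator standing in for the trained network are all hypothetical.

```python
def rolling_multistep(history, one_step_model, window, steps=3):
    """Recursive forecast: each prediction is appended to the input window."""
    buffer = list(history[-window:])
    predictions = []
    for _ in range(steps):
        y_hat = one_step_model(buffer)
        predictions.append(y_hat)
        buffer = buffer[1:] + [y_hat]  # drop the oldest value, add the prediction
    return predictions


def linear_trend_model(window):
    """Hypothetical stand-in predictor: extrapolate the last difference."""
    return window[-1] + (window[-1] - window[-2])


# Hypothetical PV power history (kW).
history = [10.0, 12.0, 14.0, 16.0]
three_step = rolling_multistep(history, linear_trend_model, window=3, steps=3)
```

Because later steps are computed from earlier predictions rather than from measurements, errors accumulate along the horizon in such a scheme, so accuracy is expected to degrade as the number of steps grows.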
To analyze the performance of each model in three-step prediction, Table 6 gives the prediction errors of the models under the different typical weather conditions.
As can be seen from Table 6, except on overcast days, the proposed model has individual metrics slightly higher than those of the LSTM-based model. Overall, the proposed model has the highest prediction accuracy, and it can still detect ramp events well. Comparing Tables 4 and 6, the prediction accuracy of all five models deteriorates as the number of prediction steps increases, but to different degrees. Compared with single-step prediction, the mean RMSE in three-step prediction increases by 17.30%, 19.67%, 21.32%, 20.29% and 23.98% for the proposed model, the LSTM-based model, the PM, the LiAENN-based model and the BPNN-based model, respectively. Overall, the BPNN-based model performs the worst, and the proposed model is the least affected by the increase in prediction steps. That is, the proposed model improves the prediction accuracy while remaining robust to power fluctuations and weather changes.
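The percentage increases quoted above compare a model's RMSE in three-step prediction with its RMSE in single-step prediction. A minimal sketch of that arithmetic, using the standard RMSE definition and hypothetical values rather than the entries of Tables 4 and 6, is:

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def percent_increase(single_step_value, multi_step_value):
    """Relative growth of a metric from single-step to multi-step prediction."""
    return (multi_step_value - single_step_value) / single_step_value * 100.0


# Hypothetical values (kW), not taken from the paper.
actual = [0.0, 0.0, 0.0, 0.0]
single_step_pred = [1.0, 1.0, 1.0, 1.0]
three_step_pred = [1.2, 1.2, 1.2, 1.2]

rmse_single = rmse(actual, single_step_pred)
rmse_three = rmse(actual, three_step_pred)
growth = percent_increase(rmse_single, rmse_three)  # about 20 for these values
```

A smaller `growth` value indicates a model whose accuracy degrades less as the prediction horizon is extended.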

Conclusions
In this paper, a prediction model based on the localized emotion reconstruction emotional neural network was proposed for ultra-short-term PV power prediction. Based on chaotic time series analysis, the chaotic phase space reconstruction method was used to extract the hidden characteristics of the PV power time series, and the single-step cyclic rolling localized prediction mechanism was derived. The extended signal and emotional parameters were determined from the reconstructed phase space points, which provide a relatively certain local basis.
Compared with the BPNN-based model, the additional emotion-derived concepts in the neural network make the learning of the model more intelligent. Compared with the LiAENN-based model, the emotional parameters and extended signals reconstructed from the chaotic time series analysis make the model track each input pattern more closely and pick out its most useful information, so that the mapping relationship is more precise. Compared with the LSTM-based model, the combination of chaos theory and emotion theory gives the proposed model a stronger ability to predict ramp events. The simulation results confirm that the proposed model has a certain adaptability under different weather conditions.
In a real-world application, however, the utility company may argue that five-minute power prediction is less applicable, since smoothing the power quality is not an easy task. Beyond point-wise accuracy, other metrics should be considered in the next step to provide more comprehensive information for PV power prediction. In addition, meteorological factors such as solar radiation intensity and aerosol index can be used as new model inputs to further correct the prediction results. All of the above will be useful for future research on a smoothing control strategy for the power output of grid-connected PV generation systems that combines the prediction results with an energy storage system.