An Ultra-Short-Term PV Power Forecasting Method for Changeable Weather Based on Clustering and Signal Decomposition

Abstract: Photovoltaic (PV) power shows different fluctuation characteristics under different weather types, as well as strong randomness and uncertainty in changeable weather (for example, sunny to cloudy or cloudy to rainy), which results in low forecasting accuracy. For changeable weather, an ultra-short-term PV power forecasting method is proposed based on affinity propagation (AP) clustering, complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN), and a bi-directional long short-term memory (BiLSTM) network. First, the PV power output curve of the standard clear-sky day was extracted monthly from the historical data, and the PV power was normalized according to it. Second, the changeable days were extracted from the various weather types using the AP clustering algorithm and the Euclidean distance, with the mean and variance of the clear-sky power coefficient (CSPC) as clustering indicators. Third, the CEEMDAN algorithm was used to decompose the data of changeable days to reduce its overall non-stationarity, and each component was forecasted with a BiLSTM network to obtain the PV forecast in changeable weather. Using the PV dataset obtained from Alice Springs, Australia, the presented method was verified by comparative experiments with the BP, BiLSTM, and CEEMDAN-BiLSTM models; the MAPE of the proposed method was 2.771%, which was better than that of the other methods.


Introduction
With the deteriorating global climate and the rapid growth of clean energy consumption [1], solar energy resources have received attention from many countries because they are abundant, secure, and environmentally friendly [2]. The cost of PV power generation technology continues to decrease, and it is widely used in the transportation, construction, and lighting industries, where its promotion has brought significant economic and environmental benefits to society [3]. However, PV power generation systems are affected by weather, resulting in uncertainty and intermittency [4], and their high penetration rate also brings many challenges to the safe and stable operation of power systems, potentially leading to voltage instability, reduced power quality, and islanding effects [5].
Accurate PV power forecasting can effectively reduce the risk faced by the grid and improve the economic efficiency of the power system. PV forecasting methods can be classified into physical, statistical, and hybrid approaches [6][7][8]. In [8], different types of time series forecasting models were compared, with the following findings: the accuracy of the physical model in short-term PV power forecasting was low; traditional statistical models have difficulty fitting nonlinear PV power series accurately, resulting in poor forecasting performance; deep learning models can extract useful features from complex PV power series and forecast better; and hybrid models based on deep learning have become a research hotspot because of their excellent forecasting performance. Physical methods [9] construct simulation models based on specific parameters of PV systems in order to calculate the output power of those systems. Statistical methods include two categories: traditional statistical models such as seasonal autoregressive integrated moving average (SARIMA) [10], support vector regression (SVR) [11], and gray models (GM) [12], and artificial intelligence methods such as convolutional neural networks (CNNs) [13] and long short-term memory (LSTM) [14]. Hybrid methods combine the advantages of several different models in order to achieve an improved forecasting accuracy. Time series forecasting models also have a wide range of applications in the fields of electricity prices, environment, and tourism [15][16][17][18]. For example, [17] proposed a functional autoregressive model of order P based on a two-component estimation procedure, which effectively improved the accuracy of electricity price forecasting.
Energies 2023, 16, 3092
In recent years, hybrid models have become a research hotspot in the field of PV power forecasting due to their high forecasting accuracy. At present, hybrid forecasting models are mainly divided into those based on clustering algorithms and those based on signal decomposition. The study in [19] established LSTM forecasting models under ideal weather conditions and divided non-ideal weather into three types: rainy, cloudy, and overcast. In addition, a combined discrete grey model (DGM)-LSTM forecasting model was established. In [20], an extreme random tree classification model was used to classify the PV data into four categories according to the meteorological conditions. Furthermore, the power forecast fully considered the influence of the changing weather conditions and the daily variation pattern of the PV power. The study in [21] used time-series generative adversarial networks (TimeGAN) to perform data enhancement and proposed a K-medoids clustering method based on soft dynamic time warping (soft-DTW) to classify the enhanced data into sunny days, cloudy days, and rainy days. The experimental results showed that the enhanced training data had a better clustering effect. The study in [22] used a self-organized map (SOM) to classify numerical weather forecast information and classified the local weather types for the next 24 h into sunny days, cloudy days, and rainy days before making the forecasts, which effectively improved the forecasting accuracy. In [23], 33 weather types were regrouped into 10 types based on generative adversarial networks and convolutional neural networks in order to achieve a more accurate classification.
Considering the high non-stationarity of PV power series affected by weather factors, many studies have combined signal decomposition algorithms with forecasting models. The study in [24] decomposed and reconstructed the PV power series into high-frequency and low-frequency components using ensemble empirical mode decomposition (EEMD) and constructed LSTM-SVR-BO hybrid models for both of them. In [25], historical data were decomposed into several subcomponents based on the variational mode decomposition (VMD) algorithm, and the subcomponents were then input into a hybrid forecasting model composed of a convolutional neural network (CNN) and a bi-directional gated recurrent unit (BiGRU). The study in [26] used EEMD to decompose the original data, merged the subcomponents with similar sample entropy (SE), built LSTM networks for the reconstructed components, and optimized the LSTM network parameters using the sparrow search optimization algorithm (SSA). In [27], a forecasting model optimization method based on CEEMDAN and the multi-objective chameleon swarm algorithm (MOcsa) was proposed, improving the effectiveness and stability of the forecasting model. The study in [28] used a random forest (RF) to calculate the weights of each factor, filtered similar days using an improved gray ideal value approximation (IGIVA), and attenuated the volatility of the power series using the CEEMD algorithm. The study in [29] used fuzzy entropy (FE) to reconstruct the sub-sequences generated by CEEMDAN decomposition and obtained the maximum, minimum, and average values of the reconstructed sequence using fuzzy information granulation (FIG), which extracted the signal characteristics more effectively and reduced the computational complexity at the same time. Combined PV power forecasting methods based on signal decomposition decompose the original, highly volatile PV power sequence into subseries in different frequency domains, which can effectively improve the model forecasting accuracy.
Weather classification and power forecasting methods for PV power generation have been studied in the existing literature, but there is no dedicated research on changeable weather, and the forecasting accuracy of PV power is generally low under this weather type. For this reason, an ultra-short-term PV power forecasting method for changeable weather is proposed in this paper. In this method, first, the nonlinear PV power generation in a day is linearized by the CSPC; next, the changeable weather is extracted from the various weather types by AP clustering; and then the PV power generation in changeable weather is forecasted by modal decomposition and the separate forecasting of each component.
The rest of the paper is organized as follows. In Section 2, the clear-sky normalization method is proposed and the principle of the AP clustering algorithm is presented. In Section 3, the principles of the CEEMDAN algorithm and the BiLSTM network are presented and a framework for ultra-short-term PV power forecasting is proposed. In Section 4, experiments are conducted with a PV plant in Alice Springs, Australia, and the results are compared with those of other models. In Section 5, the conclusions and future research directions of this paper are given.

Clear-Sky Normalization
In order to avoid the impact of the installed capacity of different power stations, it is necessary to normalize the PV power data. The data are usually normalized to the interval [0, 1] by min-max normalization based on the maximum and minimum values of the PV power series. The formula is as follows [22]:

$$P^{*} = \frac{P - P_{\min}}{P_{\max} - P_{\min}} \qquad (1)$$

where $P^{*}$ is the PV power value after normalization; $P$ is the original PV power sequence; $P_{\min}$ and $P_{\max}$ are the minimum and maximum values of the sequence.
The above method normalizes using only the maximum and minimum values, without considering the uncertainty characteristics of the PV output variation; thus, the normalization result is random. Therefore, in this paper, normalization was carried out based on the uncertainty of PV power: the maximum PV power at each moment of each month was selected to form the clear-sky curve of the month, which represents the standard sunny PV power sequence of that month. Then, the PV power data were normalized to the CSPC, denoted $S$, using the clear-sky curve as the standard:

$$C_j = \max_{1 \le i \le m} P_{ij} \qquad (2)$$

$$S_{ij} = \frac{P_{ij}}{C_j} \qquad (3)$$

where $S_{ij}$ is the CSPC at time $j$ of day $i$, with $S_{ij} \in [0, 1]$; $P_{ij}$ is the PV power at time $j$ of day $i$; $C_j$ is the PV power at time $j$ of the monthly clear-sky day; and $m$ is the number of days in the month.
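The clear-sky normalization can be illustrated with a minimal sketch. The function names and the five-sample toy days are assumptions made for illustration; mapping time steps with zero clear-sky power (night) to a CSPC of 0 is also an assumption, since the text does not say how such steps are handled.

```python
def clear_sky_curve(month_power):
    """Monthly clear-sky curve C_j: maximum power at each time step over all days.

    month_power is a list of days; each day is a list of power samples.
    """
    n_steps = len(month_power[0])
    return [max(day[j] for day in month_power) for j in range(n_steps)]


def to_cspc(month_power, curve):
    """Clear-sky power coefficient S_ij = P_ij / C_j.

    Assumption: steps where C_j is zero (night) are mapped to 0.
    """
    return [[p / c if c > 0 else 0.0 for p, c in zip(day, curve)]
            for day in month_power]


def from_cspc(cspc_day, curve):
    """Denormalization: power = CSPC * clear-sky power."""
    return [s * c for s, c in zip(cspc_day, curve)]


# Toy month with three days of five samples each (made-up values).
month = [
    [0.0, 2.0, 4.0, 2.0, 0.0],
    [0.0, 1.0, 3.0, 1.5, 0.0],
    [0.0, 2.5, 1.0, 2.0, 0.0],
]
curve = clear_sky_curve(month)
cspc = to_cspc(month, curve)
print(curve)
print(cspc[1])
```

Because the curve is the per-step maximum over the month, every coefficient falls in [0, 1] by construction, and multiplying by the same curve recovers the original power.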

AP Clustering Algorithm
The AP algorithm is a message-passing clustering algorithm that was proposed by Frey et al. in 2007 [30]. Its advantage is that the number of clustering centers does not need to be set in advance (i.e., it can automatically complete the clustering process when the number of clustering centers is unknown). All sample data points are regarded as potential clustering centers. The number and location of the clustering centers are repeatedly revised by passing messages between data points and updating the attraction information and the degree of belonging; the optimal clustering centers are then selected from the data points, and the remaining points are allocated to their corresponding clusters.
We defined the similarity matrix S(i, k) in order to describe the degree of similarity between two points, that is, the degree to which point k is suitable as the clustering center of point i [30]:

$$S(i,k) = -\lVert x_i - x_k \rVert^{2} \qquad (4)$$

where $\lVert x_i - x_k \rVert$ is the Euclidean distance between point i and point k, and S(i, k) is the similarity between the two points. The larger the value, the more suitable point k is as the clustering center of point i.
Based on the similarity matrix S, the attraction matrix r(i, k) was constructed in order to represent the attraction information of point k to point i. The formula is as follows [30]:

$$r_{t+1}(i,k) = S(i,k) - \max_{j \neq k}\left\{ a_t(i,j) + r_t(i,j) \right\}$$

where $r_t(i,j)$ is the degree to which points other than data point k are suitable as the clustering center of point i at time t, and $a_t(i,j)$ is the degree to which point i selects points other than point k as the clustering center at time t, with an initial value of zero. Only the positive values of r(i, k) contribute to the attribution update. The attribution matrix a(i, k) was constructed to represent the attribution information of point i to point k. The specific formula is as follows [30]:

$$a_{t+1}(i,k) = \begin{cases} \min\left\{0,\; r_{t+1}(k,k) + \sum_{j \notin \{i,k\}} \max\left\{0,\, r_{t+1}(j,k)\right\}\right\}, & i \neq k \\ \sum_{j \neq k} \max\left\{0,\, r_{t+1}(j,k)\right\}, & i = k \end{cases}$$

where $a_{t+1}(i,k)$ is the degree to which point i selects point k as an appropriate clustering center at time t + 1, and $r_{t+1}(k,k)$ indicates how suitable point k is as a clustering center.
In order to avoid oscillation, the damping coefficient λ, which has a default value of 0.5, was introduced in order to update the iterative values of the attraction matrix r(i, k) and the attribution matrix a(i, k) at time t + 1 [30]:

$$r_{t+1}(i,k) \leftarrow \lambda\, r_t(i,k) + (1-\lambda)\, r_{t+1}(i,k)$$

$$a_{t+1}(i,k) \leftarrow \lambda\, a_t(i,k) + (1-\lambda)\, a_{t+1}(i,k)$$
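One damped message-passing sweep of AP can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the implementation used in the paper: it follows the update rules as published by Frey and Dueck, in which rival candidates are scored by availability plus similarity; the diagonal of the similarity matrix (the "preference") is set to the median off-diagonal similarity, a common default that the text does not specify; and the four (mean, variance) points are made-up values.

```python
from statistics import median


def similarity_matrix(points):
    """Negative squared Euclidean distance; diagonal = median preference (assumption)."""
    n = len(points)
    s = [[-sum((u - v) ** 2 for u, v in zip(points[i], points[k]))
          for k in range(n)] for i in range(n)]
    preference = median(s[i][k] for i in range(n) for k in range(n) if i != k)
    for k in range(n):
        s[k][k] = preference
    return s


def sweep(s, r, a, damping=0.5):
    """One damped update of the attraction (r) and attribution (a) matrices."""
    n = len(s)
    new_r = [[0.0] * n for _ in range(n)]
    new_a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            # Best rival candidate for point i, excluding k itself.
            rival = max(a[i][j] + s[i][j] for j in range(n) if j != k)
            new_r[i][k] = damping * r[i][k] + (1 - damping) * (s[i][k] - rival)
    for k in range(n):
        pos = [max(0.0, new_r[j][k]) for j in range(n)]
        others = sum(pos) - pos[k]  # positive support for k from all j != k
        for i in range(n):
            if i == k:
                target = others
            else:
                target = min(0.0, new_r[k][k] + others - pos[i])
            new_a[i][k] = damping * a[i][k] + (1 - damping) * target
    return new_r, new_a


# Hypothetical daily (mean CSPC, variance of CSPC) points.
points = [(0.9, 0.0), (0.8, 0.0), (0.3, 0.1), (0.2, 0.1)]
s = similarity_matrix(points)
n = len(points)
r = [[0.0] * n for _ in range(n)]
a = [[0.0] * n for _ in range(n)]
r, a = sweep(s, r, a)
print(r[0][1], a[0][0])
```

In practice the sweep is repeated until the set of points with r(k, k) + a(k, k) > 0 stops changing. Library implementations typically add a tiny amount of noise to the similarities to break ties, which this sketch omits.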

CEEMDAN Decomposition Algorithm
PV power in changeable weather is significantly volatile and has nonlinear and non-stationary characteristics due to the variable weather factors; thus, it should first be made more stationary by decomposing the originally complex PV power series into individual components with more concentrated fluctuation characteristics in different frequency domains. Then, a forecasting model is built for each subcomponent.
Signal decomposition is a commonly used method for making time series more stationary. Empirical mode decomposition (EMD) does not need a basis function to be defined before decomposition; instead, it generates the intrinsic mode functions adaptively according to the characteristics of the original signal, decomposing the complex signal into several more stationary and regular intrinsic mode function (IMF) components that reflect the local characteristics of the original signal at different time scales. The CEEMDAN algorithm adds adaptive Gaussian white noise to the data to be decomposed at each stage and performs an overall averaging calculation for each order component, which not only effectively reduces the modal mixing of the EMD algorithm, but also solves the problem of the transfer of white noise from high to low frequencies and improves the computational speed [27].
Let $E_i(\cdot)$ be the i-th modal component obtained by EMD decomposition, $\omega_j(t)$ be the j-th added white noise, $\varepsilon_0$ be the standard deviation of the white noise, and $x(t)$ be the original power signal. The calculation steps of the CEEMDAN algorithm are as follows [31]:
Step 1: Add Gaussian white noise that obeys the standard normal distribution to the signal $x(t)$ to be decomposed in order to obtain the new signal $x^{j}(t)$:

$$x^{j}(t) = x(t) + \varepsilon_0\, \omega_j(t), \quad j = 1, 2, \ldots, N$$

Step 2: Decompose $x^{j}(t)$ by EMD to obtain the first-order IMF component $IMF_1^{j}(t)$. The first-order IMF component $IMF_1(t)$ of the CEEMDAN decomposition is the mean value of $IMF_1^{j}(t)$, and the residual signal $r_1(t)$ follows from it:

$$IMF_1(t) = \frac{1}{N}\sum_{j=1}^{N} IMF_1^{j}(t), \qquad r_1(t) = x(t) - IMF_1(t)$$

where N is the number of times that white noise is added.
Step 3: Add the white noise component after one EMD decomposition to the first-order residual signal $r_1(t)$, continue the EMD decomposition to obtain the second-order IMF component $IMF_2^{j}(t)$ and the residual $r_2(t)$, and derive the second-order component $IMF_2(t)$:

$$IMF_2(t) = \frac{1}{N}\sum_{j=1}^{N} E_1\!\left(r_1(t) + \varepsilon_1 E_1(\omega_j(t))\right), \qquad r_2(t) = r_1(t) - IMF_2(t)$$

Step 4: Repeat the above steps, calculating the nth-order IMF component $IMF_n^{j}(t)$ and the residual $r_n(t)$ to find the nth-order component $IMF_n(t)$:

$$IMF_n(t) = \frac{1}{N}\sum_{j=1}^{N} E_1\!\left(r_{n-1}(t) + \varepsilon_{n-1} E_{n-1}(\omega_j(t))\right), \qquad r_n(t) = r_{n-1}(t) - IMF_n(t)$$

where $\varepsilon_{n-1}$ is the weight coefficient of the (n−1)th-order white noise.
Step 5: Repeat Step 4 until the residual has a monotonic trend, and then stop the iteration, at which point the K-th order IMF component is obtained and the original signal $x(t)$ is decomposed as [31]:

$$x(t) = \sum_{k=1}^{K} IMF_k(t) + r_K(t)$$

where K is the total number of IMF components obtained from the CEEMDAN decomposition and $r_K(t)$ is the final residual.
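The stage-wise structure of Steps 1 to 5 can be sketched as below. This only illustrates the noise-averaging and residual bookkeeping: `first_mode` is a moving-average stand-in for the EMD operator, not a real sifting procedure, so the resulting components are not true IMFs. The fixed number of modes (instead of the monotonic-residual stopping rule), the constant noise weight `eps`, and the toy signal are likewise assumptions.

```python
import math
import random


def first_mode(signal, window=5):
    """Stand-in for E_1: the signal minus its centered moving average.

    NOT real EMD sifting; used only so the sketch runs with the standard library.
    """
    n = len(signal)
    half = window // 2
    mode = []
    for t in range(n):
        lo, hi = max(0, t - half), min(n, t + half + 1)
        mode.append(signal[t] - sum(signal[lo:hi]) / (hi - lo))
    return mode


def kth_mode(signal, k, window=5):
    """Stand-in for E_k: peel off first modes k times and return the last one."""
    residual = list(signal)
    mode = residual
    for _ in range(k):
        mode = first_mode(residual, window)
        residual = [r - m for r, m in zip(residual, mode)]
    return mode


def ceemdan_sketch(x, n_modes=3, n_trials=20, eps=0.2, seed=0):
    """Average first modes over noisy copies of the current residual, stage by stage."""
    rng = random.Random(seed)
    noises = [[rng.gauss(0.0, 1.0) for _ in x] for _ in range(n_trials)]
    residual = list(x)
    imfs = []
    for k in range(1, n_modes + 1):
        acc = [0.0] * len(x)
        for w in noises:
            # Stage 1 adds the raw noise; stage k > 1 adds its (k-1)-th mode.
            pert = w if k == 1 else kth_mode(w, k - 1)
            trial = [r + eps * p for r, p in zip(residual, pert)]
            acc = [s + m for s, m in zip(acc, first_mode(trial))]
        imf = [s / n_trials for s in acc]
        imfs.append(imf)
        residual = [r - m for r, m in zip(residual, imf)]
    return imfs, residual


# Toy signal: a slow wave plus a fast ripple (made-up data, 96 samples).
x = [math.sin(0.1 * t) + 0.3 * math.sin(2.5 * t) for t in range(96)]
imfs, residual = ceemdan_sketch(x)
recon = [sum(m[t] for m in imfs) + residual[t] for t in range(len(x))]
print(max(abs(u - v) for u, v in zip(recon, x)))
```

Because each residual is defined by subtracting the averaged component, the components and the final residual sum back to the original signal up to floating-point error, which is the completeness property stated in Step 5.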

BiLSTM Neural Network
The LSTM network is widely used as a time series algorithm. It is a special kind of recurrent neural network (RNN) that can learn the long-term dependencies of time series and alleviate the problems of gradient vanishing and gradient explosion, which occur during the training of long series in traditional RNNs. As shown in Figure 1, the memory cell of the LSTM consists of the forget gate, input gate, and output gate. The specific calculation process of LSTM is as follows [32]:
The forget gate is responsible for controlling the discarding of redundant information from the previous moment's cell state $C_{t-1}$:

$$f_t = \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right)$$

The input gate determines how much new information $\tilde{C}_t$ is allowed to be added to the cell state $C_t$ at the current moment:

$$i_t = \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right)$$

$$\tilde{C}_t = \tanh\left(W_c\,[h_{t-1}, x_t] + b_c\right)$$

$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$$

The output gate determines the current moment network output value $h_t$ based on the cell state $C_t$:

$$o_t = \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right)$$

$$h_t = o_t \odot \tanh(C_t)$$

where $x_t$ is the input at the current moment; $h_t$ is the output at the current moment; $W$ and $b$ are the weight matrices and bias vectors of the corresponding gates; and $\sigma(x)$ and $\tanh(x)$ represent the Sigmoid and Tanh activation functions, respectively, which can be expressed as follows:

$$\sigma(x) = \frac{1}{1 + e^{-x}}, \qquad \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$

The BiLSTM network consists of a combination of forward and reverse LSTM networks. This structure makes the BiLSTM network more effective than the LSTM at capturing the bi-directional dependence information in the time series and extracting the features of the PV power series. The BiLSTM network structure diagram is shown in Figure 2. From Figure 2, we can see that the calculation process of the forward LSTM structure in the BiLSTM network is similar to that of a single LSTM network, and that the hidden layer state of the BiLSTM network is obtained by combining the forward hidden layer state and the reverse hidden layer state. Its calculation formula is as follows:

$$\overrightarrow{h}_t = \overrightarrow{\mathrm{LSTM}}\left(h_{t-1}, x_t, C_{t-1}\right)$$

$$\overleftarrow{h}_t = \overleftarrow{\mathrm{LSTM}}\left(h_{t+1}, x_t, C_{t+1}\right)$$

$$h_t = \left[\overrightarrow{h}_t, \overleftarrow{h}_t\right]$$
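The gate computations and the bi-directional combination can be sketched with scalar states in plain Python. This is a toy illustration, not the network trained in the paper: the parameter layout (one `(w_h, w_x, b)` triple per gate), the hand-picked parameter values, and the use of a tuple to combine the two directions are all assumptions.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step with scalar state; params maps gate name -> (w_h, w_x, b)."""
    def gate(name, activation):
        w_h, w_x, b = params[name]
        return activation(w_h * h_prev + w_x * x_t + b)

    f_t = gate("f", sigmoid)        # forget gate
    i_t = gate("i", sigmoid)        # input gate
    c_tilde = gate("c", math.tanh)  # candidate cell state
    o_t = gate("o", sigmoid)        # output gate
    c_t = f_t * c_prev + i_t * c_tilde
    h_t = o_t * math.tanh(c_t)
    return h_t, c_t


def run_lstm(xs, params):
    """Run the cell over a sequence from zero initial state; return all h_t."""
    h, c, outputs = 0.0, 0.0, []
    for x in xs:
        h, c = lstm_step(x, h, c, params)
        outputs.append(h)
    return outputs


def run_bilstm(xs, params_fwd, params_bwd):
    """Pair the forward state with the state of a pass over the reversed input."""
    forward = run_lstm(xs, params_fwd)
    backward = run_lstm(xs[::-1], params_bwd)[::-1]
    return list(zip(forward, backward))


# Hand-picked toy parameters (hypothetical, not trained).
demo = {"f": (0.0, 0.0, 0.0), "i": (0.0, 0.0, 0.0),
        "c": (0.0, 1.0, 0.0), "o": (0.0, 0.0, 0.0)}
h1, c1 = lstm_step(1.0, 0.0, 0.0, demo)
states = run_bilstm([0.2, 0.8, 0.2], demo, demo)
print(h1, c1)
print(states)
```

With all gate weights at zero, each gate sits at 0.5, so the first step gives C = 0.5·tanh(1) and h = 0.5·tanh(C). For the palindromic input, the backward state at the first position equals the forward state at the last, which shows that the reverse pass sees the sequence from the other end.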

Combined Model Forecasting Process
This paper proposes an ultra-short-term PV power forecasting method based on a combined AP-CEEMDAN-BiLSTM model that considers the volatility characteristics of the PV output. First, the mean and variance of the daily PV power data were selected as clustering indicators, and the PV output was classified into sunny days, cloudy days, and changeable days based on AP clustering. The CEEMDAN algorithm was then used to decompose the changeable weather data into K different modal components to reduce the complexity of the input sequence; these components were input into the BiLSTM network for training and forecasting, and the forecasting results of all components were accumulated. The forecasting framework of the proposed method is shown in Figure 3, which can be mainly divided into four parts: PV power normalization, weather clustering, decomposition forecasting, and denormalization. The specific forecasting process steps are as follows:
1. Clear-sky normalization: Using the PV power history data of the whole year as the dataset, the maximum value of each moment in each month in the dataset was extracted to form the monthly clear-sky curve, which represents the standard "clear-sky days" of each month. The historical power data and the preliminary forecasted value of future power were normalized with the clear-sky curve as the standard, and the CSPC (including the real value in the past and the forecasted value in the future) was obtained;
2. AP weather clustering: The mean and variance of the daily CSPC were calculated and subsequently used as clustering indicators for AP clustering, classifying data points into three weather types based on PV output characteristics: sunny, cloudy, and changeable weather;
3. Combined CEEMDAN-BiLSTM model: The CEEMDAN decomposition algorithm was used to decompose the changeable day data into n IMF components and one residual component in order to reduce the non-stationarity of the data, and they were then input into the BiLSTM network for the forecasting;
4. Clear-sky denormalization: The CSPC was denormalized according to the clear-sky curve in order to obtain the final power forecasting results.

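The last two steps, accumulating the component forecasts and denormalizing the result, can be sketched as follows. The function name, the toy numbers, and the clipping of the summed CSPC to [0, 1] are assumptions; the text states only that the CSPC lies in [0, 1] and that the component results are accumulated.

```python
def combine_and_denormalize(component_forecasts, clear_sky):
    """Sum per-component CSPC forecasts and convert them back to power.

    component_forecasts: one list per component (IMFs plus residual), equal lengths.
    clear_sky: clear-sky power C_j at the forecast time steps.
    """
    cspc = [sum(values) for values in zip(*component_forecasts)]
    # Assumption: keep the reconstructed CSPC inside its defined range [0, 1].
    cspc = [min(1.0, max(0.0, s)) for s in cspc]
    return [s * c for s, c in zip(cspc, clear_sky)]


# Hypothetical forecasts for two IMFs and the residual over three time steps.
components = [
    [0.05, -0.02, 0.01],
    [0.10, 0.05, -0.05],
    [0.60, 0.62, 0.64],
]
clear_sky = [100.0, 120.0, 110.0]  # made-up clear-sky power
power = combine_and_denormalize(components, clear_sky)
print(power)
```

Summation is the natural inverse of the decomposition because the components and the residual add up to the original CSPC series.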

Data Description
In order to verify the weather clustering method and the combined model forecasting method in this paper, a site of the Desert Knowledge Australia Solar Center (DKASC) was taken as the research object [33]. DKASC is located in the town of Alice Springs in the Northern Territory of Australia, which has a dry desert climate and rich solar energy resources.
The measured data of the PV output for one year and four months were selected as the sample, in which a whole year's data were used as the training set, two months of data as the verification set, and a month of data as the test set. The sampling interval was 15 min, and 96 data points were collected every day. The model uses a rolling forecast, using the values of the previous 24 h (96 values) as input to forecast the value for the next moment. Typical sunny, cloudy, and changeable days in each season were selected from the training set and are displayed in Figure 4.

The computer hardware used in the experiment was an AMD Ryzen 7 5800H CPU, an NVIDIA GeForce RTX 3070 graphics card, and 16 GB of memory.
From the various types of weather and the corresponding CSPC shown in Figure 4, it can be seen that the CSPC curve for sunny days was relatively smooth, with a magnitude close to 1. The volatility of the CSPC curve for cloudy days was small, with a slightly smaller magnitude. The CSPC for changeable weather had a large volatility and was non-stationary from the point of view of the time series.
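The rolling input described in this section (96 past values to forecast the next one) corresponds to a sliding-window construction of training samples. A minimal sketch follows, assuming the window always contains measured values; the function name and the synthetic series are illustrative only.

```python
def rolling_windows(series, lookback=96):
    """Build (input window, next value) pairs with a one-step stride.

    Each input holds the previous `lookback` samples (24 h at 15 min resolution);
    the target is the sample that immediately follows.
    """
    inputs, targets = [], []
    for t in range(lookback, len(series)):
        inputs.append(series[t - lookback:t])
        targets.append(series[t])
    return inputs, targets


# Synthetic series of 200 samples (made-up values, a little over two days).
series = [float(t % 96) for t in range(200)]
X, y = rolling_windows(series)
print(len(X), len(X[0]), y[0])
```

A series of length L yields L − 96 samples, so the first day of any split only supplies inputs and produces no targets of its own.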

Model Evaluation Criteria
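Three commonly used evaluation indicators of regression models, namely the mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean square error (RMSE), were used to evaluate and compare the accuracy of the forecasting model. The calculation formulas are as follows [34]:

$$\mathrm{MAE} = \frac{1}{N}\sum_{t=1}^{N}\left|P_t - \hat{P}_t\right|$$

$$\mathrm{MAPE} = \frac{1}{N}\sum_{t=1}^{N}\left|\frac{P_t - \hat{P}_t}{P_t}\right| \times 100\%$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{t=1}^{N}\left(P_t - \hat{P}_t\right)^{2}}$$

where $P_t$ represents the actual value of power, $\hat{P}_t$ represents the forecasted value of power, and N represents the number of points of the forecasted value of future power.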
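The three indicators can be computed directly from their definitions, as in the sketch below. The sample values are made up. MAPE is undefined where the actual power is zero (for example at night); how such points were treated is not stated in the text, so this sketch assumes the caller has already excluded them.

```python
import math


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(p - q) for p, q in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error in percent.

    Assumption: `actual` contains no zeros (filter night-time samples first).
    """
    return 100.0 * sum(abs((p - q) / p) for p, q in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean square error."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(actual, forecast)) / len(actual))


# Made-up actual and forecasted power values.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]
print(mae(actual, forecast), mape(actual, forecast), rmse(actual, forecast))
```

MAE and RMSE carry the unit of the power series, while MAPE is relative, so the same absolute error weighs more heavily at low power levels.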

Experimental Results and Analysis
First, the PV power of the whole year of the training set was taken as the dataset, the maximum value at each time of day in each month was calculated, and the clear-sky curve of the "clear-sky day" for each of the 12 months was extracted, as shown in Figure 5. Australia is located in the Southern Hemisphere. Alice Springs is located in the Northern Territory of Australia and has an arid desert climate. Every year, September to November is spring, December to February is summer, March to May is autumn, and June to August is winter. It can be seen from Figure 5 that the PV output fluctuates seasonally. In summer, due to weather factors such as a long duration of sunshine, the PV output value of the clear-sky curve was significantly higher than that of other seasons. The PV module starts working early and stops late every day, and the power generation cycle is long. In winter, the PV power generation capacity is low and the daily power generation time is also short.
The historical PV power data were input into the BiLSTM model for the preliminary forecasting, and the preliminary forecasted value of the PV power at the future time was obtained. According to the corresponding clear-sky curve, the historical PV power data and the forecasted value of the future PV power data were clear-sky normalized to obtain the CSPC series. Taking the mean and variance of the PV output at all times of the day as the clustering index, the CSPC was clustered based on the AP clustering algorithm, and three weather types were obtained: sunny, cloudy, and changeable. The clustering results are shown in Figure 6. In Figure 6, the horizontal axis is the mean value of the CSPC, and the vertical axis is the variance. From the classification results, the sunny days corresponded to the case in which the mean value of the CSPC was large and the variance was small; the mean and variance of the changeable weather were kept in a certain numerical range; and the mean value of the CSPC was small on cloudy days. In the current classification, the distance between different dates used the similarity matrix expressed by the 2-norm in Equation (4). Different ways of defining the similarity matrix will produce different results of weather type classification. Therefore, a more detailed classification of weather types can be achieved by defining a more complex similarity matrix.
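The clustering index used here, the mean and variance of each day's CSPC, can be computed as in the following sketch. The population variance, the inclusion of every sample of the day, and the three toy days are assumptions, since the text does not specify these details.

```python
from statistics import mean, pvariance


def daily_features(cspc_days):
    """Return one (mean, variance) pair per day of CSPC samples.

    Assumption: population variance over all samples supplied for the day.
    """
    return [(mean(day), pvariance(day)) for day in cspc_days]


# Made-up CSPC samples for three kinds of day.
sunny = [1.0, 1.0, 1.0, 1.0]
cloudy = [0.3, 0.3, 0.3, 0.3]
changeable = [1.0, 0.2, 0.9, 0.3]
features = daily_features([sunny, cloudy, changeable])
print(features)
```

In this toy case the sunny day has a high mean and no variance, the cloudy day a low mean and no variance, and the changeable day an intermediate mean with a clearly larger variance, which mirrors the separation described for Figure 6.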
Considering the non-stationarity of the changeable day data, Gaussian white noise was added to the changeable day sequence, and the CEEMDAN algorithm was used to decompose the sequence into 12 IMF components and a residual component step by step.The specific decomposition results are shown in Figure 7.As can be seen, due to the strong non-stationarity of the abruptly changeable day data, many modes were obtained by decomposition, and the fluctuation characteristics differed greatly from each other.If the forecasting is made directly without decomposition, the forecasting accuracy of the model will decline.Among them, IMF1 to IMF4 showed the characteristics of high frequency and strong randomness, which makes it difficult to forecast, but it cannot be removed as the randomness component because of its large amplitude change, otherwise it will affect the forecasting accuracy.IMF5 to IMF12 had a lower frequency and certain periodic change pattern, which makes the forecasting less difficult; Res was the trend component, and its trend indicated the overall decreasing trend of PV power.Therefore, it is important to build BiLSTM models for each component separately for training and forecasting.In Figure 6, the horizontal axis is the mean value of the CSPC, and the vertical axis is the variance.From the classification results, the sunny day corresponded to the case that the mean value of the CSPC was large and the variance was small; the mean and variance of the changeable weather were kept in a certain numerical range; the mean value of the CSPC was small on cloudy days.In the current classification, the distance between different dates used the similarity matrix expressed by the 2-norm in Equation (4).Different ways of defining the similarity matrix will produce different results of weather type classification.Therefore, a more detailed classification of weather types can be achieved by defining a more complex similarity matrix.
The BiLSTM models were established to forecast each modal component of the CSPC on changeable days, and the forecasting results were de-normalized according to the corresponding clear-sky curve to obtain the final photovoltaic power forecasting results. To verify the validity of the proposed forecasting method, BP neural network, BiLSTM neural network, and CEEMDAN-BiLSTM forecasting models were also established for changeable weather. The errors of the forecasting methods were compared using the evaluation formulas described in Section 4.2, as shown in Table 1. As can be seen from Table 1, conventional power forecasting algorithms such as BP and BiLSTM had a large deviation in changeable weather, mainly because the photovoltaic power curve corresponding to changeable weather had obvious non-stationary characteristics, which makes it difficult for the neural network to learn its regularity. In the proposed method, clear-sky normalization changed the nonlinear daily photovoltaic output into a linear CSPC. At the same time, the CEEMDAN modal decomposition divided the non-stationary CSPC time series corresponding to changeable weather into a number of modes, which the BiLSTM neural network is well suited to learn and forecast. The MAE, MAPE, and RMSE of this method were 0.029 MW, 2.771%, and 5.530 MW, respectively, which were much smaller than those of the other models. Thus, for time series with non-stationary characteristics, linearization and mode decomposition help to improve the forecasting accuracy.

The ACF and PACF of the final residuals for changeable days are shown in Figure 8. In Figure 8, the red dots in the left panel represent the autocorrelation function of the final residual sample, the red dots in the right panel represent the partial autocorrelation function of the final residual sample, the abscissa is the number of lags, and the blue lines represent the 95% confidence interval.
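A residual check of the kind shown in Figure 8 can be sketched as follows. The residual values are hypothetical, the sample ACF uses the common biased estimator, and the 95% band is assumed to be the usual white-noise approximation of plus or minus 1.96 divided by the square root of the sample size. The PACF is not computed in this sketch.

```python
import math

def sample_acf(residuals, max_lag):
    """Sample autocorrelation for lags 0..max_lag (biased estimator)."""
    n = len(residuals)
    mu = sum(residuals) / n
    centered = [r - mu for r in residuals]
    denom = sum(c * c for c in centered)
    return [
        sum(centered[t] * centered[t + k] for t in range(n - k)) / denom
        for k in range(max_lag + 1)
    ]

def white_noise_bound(n):
    """Approximate 95% confidence bound for the ACF of white noise."""
    return 1.96 / math.sqrt(n)

# Hypothetical forecast residuals (actual minus forecast).
residuals = [0.02, -0.01, 0.03, -0.02, 0.00, 0.01, -0.03, 0.02, -0.01, -0.01]
acf = sample_acf(residuals, max_lag=3)
bound = white_noise_bound(len(residuals))

# Lags whose autocorrelation falls outside the band suggest structure
# that the forecasting model has not captured.
significant_lags = [k for k in range(1, len(acf)) if abs(acf[k]) > bound]
```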
In the test set, the data of a typical changeable-weather day were selected, and the forecasting results of the four methods were compared, as shown in Figure 9. The power output curve in changeable weather fluctuated strongly, and the PV power at adjacent moments differed significantly, which makes forecasting very difficult and leads to uneven accuracy across the methods. The three comparison models deviated considerably at moments of drastic change in the photovoltaic power, whereas the curve of the proposed model basically conformed to the real values.
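The three error measures used to compare the methods in Table 1 can be computed as in the sketch below. The formulas of Section 4.2 are not reproduced in this section, so the standard definitions are assumed here; skipping samples with zero actual power in the MAPE is an additional assumption, and the sample values are hypothetical.

```python
import math

def mae(actual, predicted):
    """Mean absolute error, in the unit of the inputs."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error in percent.

    Assumption: samples with zero actual power are skipped to avoid
    division by zero.
    """
    terms = [abs((a - p) / a) for a, p in zip(actual, predicted) if a != 0]
    return 100.0 * sum(terms) / len(terms)

def rmse(actual, predicted):
    """Root mean square error, in the unit of the inputs."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(actual))

# Hypothetical actual and forecast power values in MW.
actual = [2.0, 4.0, 5.0, 4.0]
predicted = [2.5, 3.5, 5.0, 5.0]

scores = {
    "MAE": mae(actual, predicted),
    "MAPE": mape(actual, predicted),
    "RMSE": rmse(actual, predicted),
}
```

Because the RMSE squares the errors before averaging, it weights the large deviations at moments of drastic power change more heavily than the MAE does.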

Conclusions
Accurate power forecasting has become an important technical challenge due to the high uncertainty of PV power, especially in changeable weather. In this paper, we used the PV power curve information of standard clear-sky days to normalize the daily PV power curve into the CSPC. On this basis, different types of weather days were classified, changeable days were selected, and the corresponding PV power was decomposed and forecasted. The following conclusions were drawn:
1. The normalized daily CSPC could reflect, to a certain extent, the weather changes that affect photovoltaic power generation. In this paper, the weather types were divided into sunny days, cloudy days, and changeable days, which can be further divided into more complex types based on the curve characteristics of the daily CSPC.
2. Due to the complexity of changeable days, the PV power curve has a very strong non-stationary feature, which is liable to cause low forecasting accuracy. The daily PV output power curve can be linearized by the clear-sky normalization method, and modal decomposition combined with the strategy of forecasting each component separately helps to improve the accuracy.

$\tilde{C}_t$ is allowed to add to the cell state $C_t$ at the current moment.

where $i_t$, $W_i$, $b_i$ are the computed results, weight matrices, and bias terms of the input gate, respectively; $o_t$, $W_o$, $b_o$ are the computed results, weight matrices, and bias terms of the output gate, respectively; $f_t$, $W_f$, $b_f$ are the computed results, weight matrices, and bias terms of the forget gate, respectively; $C_t$ and $C_{t-1}$ denote the current and previous cell states; and $\sigma(x)$ and $\tanh(x)$ represent the Sigmoid and Tanh activation functions, respectively. $\sigma(x)$ and $\tanh(x)$ can be expressed as follows:
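For reference, the standard textbook definitions of the two activation functions are given here; they are assumed, not confirmed, to match the numbered equations of the original.

```latex
\sigma(x) = \frac{1}{1 + e^{-x}}, \qquad
\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}
```

The Sigmoid output lies in $(0, 1)$, which suits it to gating, whereas the Tanh output lies in $(-1, 1)$.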

Figure 3. The ultra-short-term PV power forecasting framework based on weather type clustering.

Figure 4. Comparison of the PV power and the CSPC for the sunny, cloudy, and changeable days for each season.

Figure 5. Monthly clear-sky curve of the training set.

Figure 6. Results of the weather clustering.

Figure 8. The ACF and PACF plots of the final residuals for changeable days.

Figure 9. Forecasting results of changeable days using different methods.

(24)

where $x_t$ is the input at the current moment; $h_t$ is the output at the current moment; $i_t$, $W_i$, $b_i$ are the computed results, weight matrices, and bias terms of the input gate, respectively; $o_t$, $W_o$, $b_o$ are the computed results, weight matrices, and bias terms of the output gate, respectively; $f_t$, $W_f$, $b_f$ are the computed results, weight matrices, and bias terms of the forget gate, respectively; $C_t$ and $C_{t-1}$ denote the current and previous cell states; and $\sigma(x)$ and $\tanh(x)$ represent the Sigmoid and Tanh activation functions, respectively.
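How the quantities defined above interact in one time step can be sketched with a scalar LSTM cell. This follows the standard LSTM formulation and is assumed, not confirmed, to match the paper's numbered equations. The candidate-state parameters `W_c` and `b_c`, the dictionary layout of `params`, and all numeric values are hypothetical.

```python
import math

def sigmoid(x):
    """Sigmoid activation, mapping any real input to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x_t, h_prev, c_prev, params):
    """One step of a scalar LSTM cell (standard formulation, assumed).

    Each weight entry is a pair (w_h, w_x) applied to the previous
    output h_prev and the current input x_t.
    """
    def gate(name, activation):
        w_h, w_x = params["W_" + name]
        return activation(w_h * h_prev + w_x * x_t + params["b_" + name])

    f_t = gate("f", sigmoid)              # forget gate
    i_t = gate("i", sigmoid)              # input gate
    o_t = gate("o", sigmoid)              # output gate
    c_candidate = gate("c", math.tanh)    # candidate cell state

    c_t = f_t * c_prev + i_t * c_candidate  # updated cell state
    h_t = o_t * math.tanh(c_t)              # output at the current moment
    return h_t, c_t

# Hypothetical parameters for illustration only.
params = {
    "W_f": (0.5, 0.1), "b_f": 0.0,
    "W_i": (0.3, 0.8), "b_i": 0.0,
    "W_o": (0.2, 0.4), "b_o": 0.0,
    "W_c": (0.1, 0.9), "b_c": 0.0,
}

# Run the cell over a short CSPC-like input sequence.
h, c = 0.0, 0.0
for x in [0.8, 0.3, 0.9]:
    h, c = lstm_step(x, h, c, params)
```

A bi-directional network runs one such recurrence forward over the sequence and a second one backward, then combines the two outputs at each time step.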

Table 1. Comparison of the PV power forecasting errors of different methods.
