Prediction of Photovoltaic Power by the Informer Model Based on Convolutional Neural Network

Wu, Ze; Pan, Feifan; Li, Dandan; He, Hao; Zhang, Tiancheng; Yang, Shuyun

doi:10.3390/su142013022


by Ze Wu ¹, Feifan Pan ¹, Dandan Li ¹, Hao He ¹, Tiancheng Zhang ¹ and Shuyun Yang ¹,²,*

¹ College of Resource and Environment, Anhui Agriculture University, Hefei 230036, China

² Hefei Agricultural Environmental Science Observation and Experiment Station, Ministry of Agriculture and Rural Affairs, Hefei 230036, China

* Author to whom correspondence should be addressed.

Sustainability 2022, 14(20), 13022; https://doi.org/10.3390/su142013022

Submission received: 11 August 2022 / Revised: 25 September 2022 / Accepted: 8 October 2022 / Published: 12 October 2022

(This article belongs to the Special Issue Sustainable Electric Power Systems: Design, Analysis and Control)


Abstract:

Accurate prediction of photovoltaic power is of great significance to the safe operation of power grids. In order to improve the prediction accuracy, a similar day clustering convolutional neural network (CNN)–informer model was proposed to predict the photovoltaic power. Based on correlation analysis, it was determined that global horizontal radiation was the meteorological factor that had the greatest impact on photovoltaic power, and the dataset was divided into four categories according to the correlation between meteorological factors and photovoltaic power fluctuation characteristics; then, a CNN was used to extract the feature information and trends of different subsets, and the features output by CNN were fused and input into the informer model. The informer model was used to establish the temporal feature relationship between historical data, and the final photovoltaic power generation power prediction result was obtained. The experimental results show that the proposed CNN–informer prediction method has high accuracy and stability in photovoltaic power generation prediction and outperforms other deep learning methods.

Keywords:

photovoltaic power prediction; machine learning; CNN; CNN–informer

1. Introduction

Fossil energy such as coal and oil are the main energy sources in China. Excessive use will lead to problems such as energy depletion and greenhouse effect [1,2]. Under the contradiction between the continuous growth of energy demand and the deteriorating ecological environment, seeking high-efficiency, clean, and renewable energy is a necessary means to solve this series of problems. China has clearly stated plans of peaking carbon dioxide emissions by 2030 and striving to achieve carbon neutrality by 2060, and vigorously developing renewable energy is an important way to achieve these two goals [3]. Solar energy has the advantages of a large amount of resources, wide distribution, and zero emission [4]. Photovoltaic technology is an effective way to develop solar energy to reduce emissions, and a large number of photovoltaics have been incorporated into the grid [5]. However, photovoltaic power generation does not have the continuously adjustable and controllable characteristics of traditional power generation technology [6]. While using solar energy, the fluctuation and instability of solar energy also bring challenges to the real-time scheduling and safe operation of the power grid [7,8,9]. Therefore, the accurate and efficient prediction of photovoltaic power generation is of great significance to the safe and stable operation of the power grid system [10].

In recent years, researchers have carried out extensive work on the prediction of photovoltaic power generation, which can be broadly divided into physical methods, statistical methods, and artificial intelligence methods [11,12,13]. The physical method directly calculates the output power of photovoltaics from meteorological forecast data using physical models and their calculation formulas [14]. The statistical method establishes a mathematical model, inputs historical meteorological data, historical photovoltaic output data, and numerical weather forecast data into the model, establishes the mapping relationship between input and output, and thereby predicts photovoltaic power generation output [15]; it mainly includes ARMA [16,17], ARIMA [18,19], and other methods. The artificial intelligence method establishes a nonlinear relationship between photovoltaic data and external influencing factors (such as temperature), uses computers to build data-driven probability and statistical models, and then applies these models to predict photovoltaic power [20]. The main approaches are BP neural networks [21], support vector machines (SVM) [22], radial basis function (RBF) neural networks [23], and long short-term memory (LSTM) neural networks [24]; these methods are used frequently in photovoltaic power generation prediction and have achieved good prediction results. Du, P. et al. [25] proposed a model combining variational mode decomposition (VMD), maximum relevance minimum redundancy (mRMR), and a deep belief network (DBN) to predict photovoltaic output, which effectively improved the prediction accuracy. Qing, X. et al. [26] proposed a deep learning method relying on an LSTM network to predict solar irradiance, using weather forecast data issued one day earlier as the prediction input, and used physical theory to establish a mathematical model between irradiance and photovoltaic power to achieve indirect prediction of photovoltaic power. Wang, Y. et al. [27] used the Pearson correlation coefficient to extract the features that affect the photovoltaic power. According to the feature similarity, the training set is grouped by the K-means method. The grouped data are input into a GRU network for training and, finally, the predicted photovoltaic power is obtained. Wang, K. et al. [28] used a combined model of a long short-term memory neural network and a convolutional neural network (LSTM–Convolutional Network) to predict the output power of photovoltaics: the long short-term memory network extracts time series features and the convolutional neural network extracts spatial features of the data, and the prediction results are superior to those of a single model. Zhou, N. R. et al. [29] used empirical mode decomposition to eliminate the influence of noisy data on the prediction results and used the sine cosine algorithm to determine the parameters of the long short-term memory (LSTM) network. The model accuracy improved, but when the input data contain many sequences, the LSTM network cannot mine valid nonlinear information between consecutive data.

Artificial intelligence is also widely used in forecasting wind energy, load, and related quantities. In the work of Shang [30], singular spectrum analysis (SSA) was used to decompose the original wind speed into several subsignals, and then the CNN–ATT model was used to predict the wind speed. In the work of Alkesaiberi [31], the Bayesian optimization (BO) method was used to optimize and adjust the hyperparameters of Gaussian process regression (GPR) and support vector regression (SVR) with different kernels, and dynamic information was added in the construction process in order to further improve the prediction performance of the model for wind energy. Lin et al. [32] used the multispace-time scale temporal convolutional network method to process the load data, reduce the error caused by load data noise, and enhance the time series features before making predictions, which can improve the short-term load prediction accuracy. Huang et al. [33] decomposed the power load data into several subsequences whose complexity differed significantly, and then used BPNN and transformer models to predict the low-complexity and high-complexity subsequences, respectively. Finally, the prediction results of each subsequence were superimposed to obtain the final prediction result. In the work of L’Heureux [34], a transformer-based load forecasting architecture was proposed that processes time series with contextual data and outperforms state-of-the-art Seq2Seq models. Predictive capabilities in deep learning also play a vital role in future energy. The design of a monitoring and peak load forecasting system was proposed in the work of Laayati [35] and tested on a pilot open pit mine; the system can record and monitor different energy and grid quality data and provide insights into the real-time health of the grid, successfully applying AI techniques to real world systems.

In summary, most of the recent advances in photovoltaic power prediction are built on machine learning, artificial neural networks (ANN), recurrent neural networks (RNN), and convolutional neural networks (CNN). These methods can effectively forecast photovoltaic power, but when the number of inputs is large and the length of output data becomes longer, problems such as gradient disappearance and gradient explosion are prone to occur [36,37,38]. The effect of these methods is not really satisfactory. How to predict more accurately in the big data environment of the long-term sequence of photovoltaic power is an urgent problem to be solved.

The novelty of this study is that the informer model is applied to PV power prediction and combined with correlation analysis, similar day clustering, and a CNN connected to the informer model, among other methods, to achieve PV power prediction under different weather conditions.

In existing studies, some researchers have achieved good results in PV power prediction for short time series, but not for long time series. The LSTM model accepts long time series as input, but it has been shown that it cannot retain long sequences or capture long-term dependencies from them, which reduces the prediction performance [39]. The transformer model captures long-term dependencies better than the RNN and LSTM models. It adopts a self-attention mechanism to reduce the maximum path length of network signals and avoid a recurrent structure, but the time complexity and memory usage of the self-attention mechanism are O(L²) [40], which limits its application to learning from long time series. To solve these problems, Zhou [41] proposed the informer model and designed a multihead ProbSparse self-attention mechanism that reduces the time complexity and memory usage of self-attention to O(L log L). The informer model has been applied in wind power prediction [42], power load prediction [43], motor bearing vibration time series prediction [44], load forecasting of district heating systems [45], and so on. At present, it has not been applied to photovoltaic power prediction.

Photovoltaic power data form a long-term periodic time series. The multihead ProbSparse self-attention mechanism of the informer model may ignore the periodicity of photovoltaic power data. Building on the traditional informer model, this paper proposes a CNN–informer PV power prediction model based on similar day clustering. The partitioning around medoids (PAM) algorithm is used to divide the original data into four subsets corresponding to different weather conditions. By connecting the CNN to the informer model, local feature information can be fully extracted, and long-range dependencies can be extracted from the long time series of photovoltaic power and meteorological data, further improving the prediction performance.

The contributions of this paper are as follows. (1) To improve on the traditional informer model, a CNN–informer photovoltaic power prediction model with similar day clustering is proposed, which uses a CNN to extract the feature information and trends of long-term series of photovoltaic and meteorological data and uses the informer model to predict photovoltaic power. (2) The partitioning around medoids (PAM) algorithm is used to divide the days in the dataset by weather type into four subsets, namely, sunny day, cloudy day, cloudy and rainy day, and rainy day; each subset is then split into its own training and test sets, which are input into the model for training and prediction. This method improves the prediction accuracy.

The rest of the paper is organized as follows. Section 2 presents the method. Section 3 presents the framework of the proposed methodology. Section 4 presents the results and analysis of the experiment. Section 5 states conclusions.

2. Methodology

2.1. Correlation Analysis

Due to the uncertainty of the photovoltaic system, photovoltaic power generation will be affected by meteorological factors such as solar irradiance, temperature, humidity and atmospheric pressure, but different meteorological factors have different effects on photovoltaic power. In order to fully analyze the output relationship of different meteorological factors to photovoltaic power generation, the Pearson and Spearman correlation coefficients were used to calculate the correlation coefficient of each meteorological factor with respect to the power generation. For two-dimensional linear continuous signals x and y, the Pearson and Spearman correlation coefficients are expressed as:

P_{xy} = \frac{\sum_{i=1}^{n} (x_{i} - \bar{x}) (y_{i} - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_{i} - \bar{x})^{2}} \sqrt{\sum_{i=1}^{n} (y_{i} - \bar{y})^{2}}}

(1)

F_{xy} = 1 - \frac{6 \sum_{i=1}^{n} (R(x_{i}) - R(y_{i}))^{2}}{n (n^{2} - 1)}

(2)

where P_{xy} is the Pearson correlation coefficient; \bar{x} and \bar{y} are the means of the two signals; F_{xy} is the Spearman correlation coefficient; R(x_i) and R(y_i) are the ranks of x_i and y_i within their respective signals; and n is the number of samples.
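As a minimal sketch, the two coefficients can be computed in plain Python using their standard definitions. The function names are illustrative, and the rank helper assumes no tied values (with ties, the simplified Spearman formula is only approximate).

```python
import math

def pearson(x, y):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(spread_x * spread_y)

def ranks(values):
    """Rank of each value within its signal (1 = smallest). Assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def spearman(x, y):
    """Spearman correlation from squared rank differences (no-ties formula)."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))
```

A monotonic but nonlinear relationship (for example y = x²) gives a Spearman coefficient of 1 while the Pearson coefficient stays below 1, which is one reason to report both.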

2.2. Similar Day Clustering

Partitioning around medoids (PAM) is a K-medoids clustering algorithm. Its basic principle is to randomly select data objects as the initial medoids (center points) and then, in each iteration, replace a medoid with another data object whenever doing so improves the clustering. Compared with the K-means algorithm, the PAM algorithm is more robust and less sensitive to noise and outliers. In this paper, a feature vector is constructed for each day, and the PAM clustering algorithm is used to classify weather types, roughly dividing the original data into four weather types: sunny day, cloudy day, cloudy and rainy day, and rainy day. The feature vector for day i is expressed as:

X_{i} = [x_{i 1}, x_{i 2}, x_{i 3}, x_{i 4}, x_{i 5}, x_{i 6}, x_{i 7}, x_{i 8}]

(3)

where x_{i1}, x_{i2}, x_{i3}, x_{i4}, x_{i5}, x_{i6}, x_{i7}, and x_{i8} are, respectively, the maximum value of global horizontal radiation, the mean value of global horizontal radiation, the maximum value of weather temperature, the mean value of weather temperature, the maximum value of weather daily rainfall, the mean value of weather daily rainfall, the maximum value of wind speed, and the mean value of wind speed on day i.
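The daily feature vector and the clustering step can be sketched as follows. This is a simplified illustration, not the authors' implementation: the function names, the Euclidean distance, the seeded random initialization, and the greedy swap loop are all assumptions. In practice the features would probably also need scaling before clustering, since radiation and rainfall have very different magnitudes.

```python
import random

def daily_features(radiation, temperature, rainfall, wind_speed):
    """Eight-element vector for one day: (max, mean) of each variable, in order."""
    feats = []
    for series in (radiation, temperature, rainfall, wind_speed):
        feats.append(max(series))
        feats.append(sum(series) / len(series))
    return feats

def euclidean(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5

def total_cost(points, medoids):
    """Sum of distances from every point to its nearest medoid."""
    return sum(min(euclidean(p, points[m]) for m in medoids) for p in points)

def pam(points, k, seed=0):
    """Simplified PAM: random initial medoids, then greedy medoid swaps."""
    rng = random.Random(seed)
    medoids = rng.sample(range(len(points)), k)
    best = total_cost(points, medoids)
    improved = True
    while improved:
        improved = False
        for pos in range(k):
            for cand in range(len(points)):
                if cand in medoids:
                    continue
                # Try replacing one medoid with a non-medoid point.
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                cost = total_cost(points, trial)
                if cost < best - 1e-12:
                    medoids, best, improved = trial, cost, True
    labels = [min(range(k), key=lambda j: euclidean(p, points[medoids[j]]))
              for p in points]
    return medoids, labels
```

Calling `pam(vectors, k=4)` on the list of daily vectors would return the indices of four medoid days and a cluster label per day; mapping the labels to weather types (sunny, cloudy, and so on) still requires inspecting each cluster.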

2.3. Convolutional Neural Network Module

Historical photovoltaic data contains very few time information features on a single time scale, and cannot fully reflect the information and trend of time series. It is necessary to obtain more time series features from the original photovoltaic data and meteorological elements. Convolutional neural networks (CNN) have excellent feature extraction capabilities and are widely used in speech emotion recognition and face recognition [46,47,48,49]. In recent years, researchers have improved feature extraction by applying convolutional neural networks to time series data [50,51]. The essence is to use filters to extract features from data to obtain feature vectors and use activation functions to solve classification or regression problems [52]. This paper uses one-dimensional convolution to extract features from the original time series data, the formula is as follows:

y_{t} = \sum_{k=1}^{K} w_{k} x_{t-k+1} + b

(4)

where y_t is the output feature data; w_k is the k-th weight of the convolution kernel; x_{t−k+1} is the input data; b is the bias; and K is the length of the convolution kernel.
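A minimal sketch of the one-dimensional convolution in Equation (4) for a single channel, keeping only the output positions where the kernel fully overlaps the input. The function name and the "valid" boundary handling are assumptions made for illustration.

```python
def conv1d(x, w, b=0.0):
    """y_t = sum_k w_k * x_{t-k+1} + b, for positions with full kernel overlap."""
    kernel_len = len(w)
    out = []
    for t in range(kernel_len - 1, len(x)):
        # w[k] is the (k+1)-th kernel weight; x[t - k] is the matching input.
        out.append(sum(w[k] * x[t - k] for k in range(kernel_len)) + b)
    return out
```

Note that Equation (4) as written flips the kernel relative to the input, whereas deep learning libraries such as PyTorch implement `Conv1d` as cross-correlation (no flip). Because the kernel weights are learned, the two conventions are equivalent in practice.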

2.4. Informer Model

The informer model is a network structure based on the attention mechanism, which mainly improves the computational efficiency of the self-attention mechanism, multilayer network stacking and step-by-step decoding methods [41].

The model is internally composed of an encoder and a decoder, and both accept data input, but the accepted input data is different. The encoder receives a set of long-term sequences of data, and the decoder receives a set of sequences of the same length as the predicted sequence and a combination of 0 values. The encoder is composed of a stack of multihead ProbSparse self-attention and distilling. The multihead sparse self-attention mechanism is a variant of the self-attention mechanism. Its advantage lies in reducing the computational complexity of each layer of attention, which can achieve O(LlogL) in terms of time complexity and memory usage and can effectively improve the prediction ability of the model. The calculation formula of the multihead sparse self-attention mechanism is as follows:

\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{\bar{Q} K^{T}}{\sqrt{D_{k}}}\right) V

(5)

where Q, K, and V are matrices obtained by different linear transformations of the input variables; \bar{Q} is the sparse matrix obtained from Q by probabilistic sparsification; D_k is the dimension of the query and key vectors; and softmax(·) is the normalized activation function.
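The idea of the sparse query matrix can be illustrated with a single-head sketch. The selection rule used here (score each query by the maximum minus the mean of its scaled dot products, keep the top `u` queries, and give the remaining queries the mean of V) follows the description in the informer paper [41] as understood here and is not spelled out in this text. The parameter `u` and the function names are illustrative. For clarity the sketch computes the full score matrix, so it does not itself achieve O(L log L) cost; the original method samples keys to avoid this.

```python
import math

def softmax(row):
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def probsparse_attention(Q, K, V, u):
    """Single-head sketch: only the u most 'active' queries attend fully."""
    d_k = len(K[0])
    scores = [[sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in K]
              for q in Q]
    # Sparsity measure per query: max score minus mean score.
    sparsity = [max(row) - sum(row) / len(row) for row in scores]
    active = set(sorted(range(len(Q)), key=lambda i: sparsity[i],
                        reverse=True)[:u])
    mean_v = [sum(col) / len(V) for col in zip(*V)]
    out = []
    for i, row in enumerate(scores):
        if i in active:
            weights = softmax(row)
            out.append([sum(w * v[j] for w, v in zip(weights, V))
                        for j in range(len(V[0]))])
        else:
            # "Lazy" queries fall back to the average of the values.
            out.append(list(mean_v))
    return out
```

A query whose scores are nearly uniform contributes little beyond an average of V, which is why replacing its output with the mean loses little information.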

The purpose of the “distillation” operation is to reduce the size of the network parameters, give higher weights to the dominant features, and generate a focused feature map in the next layer. The process of the “distillation” operation from layer j to j + 1 at time t is as follows:

X_{j+1}^{t} = \mathrm{MaxPool}\left(\mathrm{ELU}\left(\mathrm{Conv1d}\left([X_{j}^{t}]_{AB}\right)\right)\right)

(6)

where [·]_{AB} is the attention module in multihead sparse self-attention, Conv1d represents the convolution operation on the time series, and ELU is the activation function. The “distillation” operation adds a max-pooling layer with stride 2, bringing the memory utilization down to O((2 − λ)L log L), where the value of λ is small.
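The effect of one distillation step on sequence length can be sketched for a single channel. Only the stride of 2 comes from the text; the kernel width of 3 and padding of 1 for the pooling layer, the zero-padded "same" convolution, and all function names are assumptions for illustration.

```python
import math

def elu(v, alpha=1.0):
    """Exponential linear unit."""
    return v if v > 0 else alpha * (math.exp(v) - 1.0)

def conv1d_same(x, w, b=0.0):
    """Zero-padded 1D convolution (cross-correlation form), output length = len(x)."""
    pad = len(w) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(w[k] * padded[t + k] for k in range(len(w))) + b
            for t in range(len(x))]

def max_pool(x, kernel=3, stride=2, padding=1):
    """Max pooling; stride 2 roughly halves the sequence length."""
    neg_inf = float("-inf")
    padded = [neg_inf] * padding + list(x) + [neg_inf] * padding
    n_out = (len(x) + 2 * padding - kernel) // stride + 1
    return [max(padded[i * stride:i * stride + kernel]) for i in range(n_out)]

def distill(x, w, b=0.0):
    """One distillation step: Conv1d, then ELU, then MaxPool."""
    return max_pool([elu(v) for v in conv1d_same(x, w, b)])
```

With an input of length 8, `distill` returns a sequence of length 4, so each encoder layer passes on a feature map about half as long as the one it received.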

The decoder part uses a standard decoding structure consisting of two identical multihead attention layers. Generative inference is used to mitigate the slowdown of long-term predictions and provides the decoder with the following input vectors:

X_{feed\_de}^{t} = \mathrm{Concat}(X_{token}^{t}, X_{0}^{t}) \in \mathbb{R}^{(L_{token} + L_{y}) \times d_{model}}

(7)

where X_{feed\_de}^{t} is the t-th input sequence of the decoder; X_{token}^{t} is the start token of the t-th sequence; and X_{0}^{t} is the 0-value combination mentioned above, which serves as the placeholder for the target sequence of the t-th sequence. The model adopts generative inference for decoding. Its decoder performs a multihead attention operation with the intermediate result output by the encoder, adjusts the dimension of the output data through the fully connected layer, and, finally, outputs the predicted result.
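The decoder input in Equation (7) is a simple concatenation, sketched below with nested lists. Taking the start token from the last `label_len` steps of the known history is an assumption based on the informer paper's description; the function and parameter names are illustrative.

```python
def build_decoder_input(history, label_len, pred_len):
    """Concatenate a start token with zero placeholders for the target steps.

    history: list of time steps, each a list of d_model feature values.
    Returns a list of (label_len + pred_len) rows of width d_model.
    """
    d_model = len(history[0])
    start_token = [list(row) for row in history[-label_len:]]
    placeholder = [[0.0] * d_model for _ in range(pred_len)]
    return start_token + placeholder
```

Because all `pred_len` placeholder rows are filled in one forward pass, the decoder produces the whole forecast at once instead of step by step, which is what keeps long-horizon prediction fast.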

2.5. CNN–Informer Photovoltaic Power Prediction Model

In order to better integrate the advantages of the CNN model and the informer model, the basic structure of the CNN–informer combination model was proposed, as shown in Figure 1.

The combined prediction model consists of a one-dimensional convolution feature extraction part and an informer feature integration and prediction part. First, features of the photovoltaic and meteorological input variables are extracted by one-dimensional convolution, and a high-dimensional mapping feature vector is constructed. To enhance the feature extraction ability of the model, three one-dimensional convolution layers are used in this study. Since the input dimension of the combined prediction model is small, no pooling layer is placed after the convolution layers. The input parts of the encoder and decoder of the informer model receive the output of the convolution module, and the output of this part is combined with the input to form a fully connected layer. The data received by the encoder and the decoder are different: the input of the encoder is a long sequence of historical data, whereas the input of the decoder consists of a short sequence followed by 0 values whose number equals the prediction length. The 0 values in the decoder input serve as placeholders for the predicted values. After the data enter the encoder, they pass through multiple rounds of the multihead ProbSparse self-attention module and the “distillation” module, and an intermediate result is output. The input data of the decoder first undergo a masked multihead ProbSparse self-attention operation and then a multihead attention operation with the intermediate result output by the encoder. Finally, the fully connected layer adjusts the output dimension to obtain the prediction result, and the prediction error is backpropagated to continuously optimize the model.

3. Framework of the Proposed Methodology

To improve the prediction accuracy of photovoltaic power generation, a prediction method combining a CNN and an informer model is proposed. This method takes historical meteorological variables and photovoltaic power as separate input variables and fully considers the relationship between the power and the different meteorological variables and weather types. It mainly includes three parts: data preprocessing, similar day clustering and feature extraction, and network testing and evaluation. The basic process is shown in Figure 2.

(1): The data preprocessing part includes three aspects: abnormal data and missing data processing, data correlation analysis, and data normalization processing. Among them, the purpose of the correlation analysis between photovoltaic power and meteorological variables is to select the meteorological variables that contribute more to the power generation output and use them as the follow-up research object together with the photovoltaic power generation power. The purpose of data normalization is to eliminate the dimensional influence between indicators to solve the comparability between data indicators. After the original data is standardized, each index is in the same order of magnitude, which is suitable for comprehensive comparative evaluation.
(2): The similar day clustering and feature extraction part first uses the feature vector of each day as the clustering basis to divide the dataset into four subsets: sunny day, cloudy day, cloudy and rainy day, and rainy day. After normalization, each subset is divided into training and test sets in a ratio of 8:2. The one-dimensional CNN model is then trained on the training data, its parameters are continuously optimized, the optimal parameters are selected, and the model is saved. Finally, the test data are input into the trained one-dimensional CNN model, which outputs the dataset with extracted features.
(3): The data after 1D CNN feature extraction are input to the informer model for training and prediction, and the errors of its predicted values and of the predicted values of the benchmark models (BP, CNN, and CNN–LSTM) relative to the real values are evaluated.

4. Experiment and Results

4.1. Dataset Introduction

The dataset selected for this article is from the “Desert Knowledge Australia Solar Center” [53] (106.6 kW, monocrystalline silicon) and spans the period from 11 March 2010 to 8 March 2020. The dataset includes variables such as photovoltaic power, wind speed, temperature, relative humidity, global horizontal radiation, diffuse horizontal radiation, wind direction, and weather daily rainfall. The basic information of these variables is shown in Table 1. Since the photovoltaic plant does not generate power at night, only the data from 7:00 to 19:00 each day are selected; the sampling interval is 5 min. Because of equipment failure or maintenance, some data are missing or abnormal, so the data need to be preprocessed.

4.1.1. Data Preprocessing

During data collection, sensor data are prone to gaps caused by poor network transmission quality and equipment failure, which can significantly degrade the model prediction accuracy. Photovoltaic power plant data and meteorological data have strong temporal continuity, and deleting any data would break the time series. When the gap is long, it is filled with data from days with the same weather conditions; when the gap is short, it is filled by linear interpolation, so as to obtain a complete dataset. The linear interpolation formula is as follows:

x_{a+i} = x_{a} + \frac{i (x_{a+j} - x_{a})}{j}, \quad 0 < i < j

(8)

where x_{a+i} is the missing value at time a + i, and x_a and x_{a+j} are the original data at time a and a + j, respectively.
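A minimal sketch of the linear interpolation step, assuming missing samples are marked with `None`. The function name is illustrative, and the rule for deciding when a gap is short enough to interpolate is not specified in the text, so this sketch fills every interior gap.

```python
def fill_gaps_linear(series):
    """Fill runs of None between two known samples by linear interpolation."""
    filled = list(series)
    known = [i for i, v in enumerate(filled) if v is not None]
    for a, end in zip(known, known[1:]):
        j = end - a  # distance between the two known samples
        for i in range(1, j):
            filled[a + i] = filled[a] + i * (filled[end] - filled[a]) / j
    return filled
```

Gaps at the very start or end of the series have no known sample on one side and are left as `None`; they would need another strategy, such as the same-weather substitution used for long gaps.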

Differences in units and magnitudes between variables adversely affect model training. To eliminate this effect, the data need to be made dimensionless. To ensure comparability between different types of data, the variables are normalized to [0, 1] before input. The calculation formula is as follows:

X = \frac{x - x_{min}}{x_{max} - x_{min}}

(9)

where X is the normalized result, x is the input value of the independent variable, x_max is the maximum value in the sample data, and x_min is the minimum value in the sample data.
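A short sketch of min-max normalization and its inverse, which is needed to map normalized predictions back to physical units. The function names and the handling of a constant series (mapped to 0.0) are assumptions.

```python
def min_max_normalize(values):
    """Scale values to [0, 1]; returns (normalized list, minimum, maximum)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant series: avoid division by zero.
        return [0.0 for _ in values], lo, hi
    return [(v - lo) / (hi - lo) for v in values], lo, hi

def denormalize(x_norm, lo, hi):
    """Map a normalized value back to the original scale."""
    return x_norm * (hi - lo) + lo
```

One design choice worth considering is to compute the minimum and maximum on the training split only and reuse them for the test split, so that no information from the test period leaks into training.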

4.1.2. Dataset Partitioning

According to the method in 2.2, the original dataset after preprocessing is divided into four subsets of different types. Subset 1: sunny day, subset 2: cloudy day, subset 3: cloudy and rainy day, subset 4: rainy day. In the process of inputting the data of different subsets into the model, the first 80% of the data of each subset is taken for training, and the last 20% of the data is tested.
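The split described above keeps the time order: the first 80% of each subset is used for training and the last 20% for testing, with no shuffling. A minimal sketch, with illustrative names:

```python
def chronological_split(samples, train_ratio=0.8):
    """Split an ordered sequence into leading train and trailing test parts."""
    cut = int(len(samples) * train_ratio)
    return samples[:cut], samples[cut:]

def split_subsets(subsets, train_ratio=0.8):
    """Apply the same split to every weather-type subset in a dict."""
    return {name: chronological_split(data, train_ratio)
            for name, data in subsets.items()}
```

Splitting by position rather than at random avoids training on samples that come after the test samples in time.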

4.2. Model Evaluation Metrics

To evaluate the performance and accuracy of the model, the root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE) are used as evaluation indicators; the smaller the values of RMSE, MAE, and MAPE, the better the prediction performance of the model. The calculation formulas are given in Equations (10)–(12):

RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_{i} - y_{ik})^{2}}

(10)

MAE = \frac{1}{n} \sum_{i=1}^{n} | y_{i} - y_{ik} |

(11)

MAPE = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{y_{i} - y_{ik}}{y_{ik}} \right| \times 100\%

(12)

where y_i is the i-th predicted value of the model, y_{ik} is the i-th actual value, \bar{y} is the average value of the test samples, and n is the number of test samples.
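The three metrics translate directly into code. This sketch uses illustrative function names and, as in Equation (12), divides by the actual value, so it assumes no actual value is exactly zero; for photovoltaic data, samples with zero power would need to be excluded or handled separately.

```python
import math

def rmse(predicted, actual):
    """Root mean square error."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)

def mae(predicted, actual):
    """Mean absolute error."""
    n = len(actual)
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / n

def mape(predicted, actual):
    """Mean absolute percentage error, in percent. Assumes nonzero actuals."""
    n = len(actual)
    return sum(abs((p - a) / a) for p, a in zip(predicted, actual)) / n * 100.0
```

RMSE penalizes large errors more heavily than MAE because the errors are squared before averaging, which helps explain why the two metrics can rank models differently.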

4.3. Input Variable Selection

Many factors affect photovoltaic power generation. If only the historical power data are used as a univariate input, the model has no information about other influencing factors and cannot predict accurately at moments when the power changes suddenly, resulting in low prediction accuracy. However, if there are too many input variables, the correlation between independent variables leads to collinearity problems; moreover, more input variables place greater demands on model training and make the model prone to over-fitting. Therefore, the factors that affect photovoltaic power are screened through correlation analysis and used as model input variables. Part of the data after preprocessing is shown in Figure 3. In this study, the Pearson correlation coefficient and the Spearman correlation coefficient are used to evaluate the correlation between variables. The results are shown in Figure 4. The correlation between photovoltaic power and global horizontal radiation is the largest, with Pearson and Spearman correlation coefficients of 0.89 and 0.90, respectively; diffuse horizontal radiation ranks second, with 0.37 and 0.39, respectively; the correlation between power and relative humidity is the lowest, at −0.29 and −0.31, respectively. Therefore, global horizontal radiation is selected as the main influencing variable and is used as input together with the measured power data.

4.4. Parameter Settings

Four models, CNN–informer, BP, CNN, and CNN–LSTM, were built using Python and the deep learning library PyTorch; model training and validation were performed on an Nvidia GeForce RTX 2080 GPU platform, and the prediction results of the four models were compared.

The model is trained with convolution kernels of different sizes to determine the kernel sizes that give the best accuracy on the test set. In this study, three convolutional layers are designed, with 7, 5, and 3 convolution kernels, respectively. Four combinations of kernel sizes for the three layers are tested: (3 × 3, 4 × 4, 5 × 5); (4 × 4, 5 × 5, 6 × 6); (5 × 5, 6 × 6, 7 × 7); and (6 × 6, 7 × 7, 8 × 8). The hyperparameters are adjusted on the training set, and the model with the highest accuracy on the test set is selected. Other parameters are set as shown in Table 2.

The rational use of the convolution kernel is the key for the CNN to extract the feature information of the data, and the appropriate size of convolution kernel can better capture the change rules and characteristics of the input data. Taking the rainy day experiment as an example, by changing the size of the convolution kernel of the three convolution layers, the errors are compared to determine the size of the convolution kernel of the CNN. The comparison results are shown in Table 3.

The model with convolution kernel sizes of 4 × 4, 5 × 5, 6 × 6 is compared with the models with kernel sizes of 3 × 3, 4 × 4, 5 × 5; 5 × 5, 6 × 6, 7 × 7; and 6 × 6, 7 × 7, 8 × 8. Relative to these three models, its MAE is lower by 13.7%, 10.0%, and 28.1%; its MAPE by 21.8%, 34.9%, and 49.4%; and its RMSE by 11.1%, 14.1%, and 19.2%, respectively. In general, the larger the convolution kernel, the better the extracted feature information. However, because the dimension of the input matrix in this paper is small, a kernel that is too large loses some detailed feature information, while a kernel that is too small extracts insufficient feature information; in both cases the prediction accuracy of the model decreases. Through comparative experiments, the convolution kernel sizes of the three convolutional layers are finally set to 4 × 4, 5 × 5, and 6 × 6 to build the final prediction model (in bold).

4.5. Comparison and Analysis of Prediction Results

To further demonstrate the prediction and generalization ability of the proposed CNN–informer model, BP, CNN, and CNN–LSTM models were established for each weather type to predict photovoltaic power, and their prediction results were compared with those of the model proposed in this paper. For each of the four weather types, 3-day forecasts were selected, as shown in Figure 5, Figure 6, Figure 7 and Figure 8. The figures show that the predicted photovoltaic power curve of the proposed CNN–informer model is basically consistent with the actual power curve in sunny and cloudy weather. In cloudy and rainy weather, the predicted curve is generally consistent with the actual power curve, but it deviates and fluctuates within a certain range in some time periods; this is due to the relatively strong weather changes on cloudy and rainy days, which directly affect the real-time output of photovoltaic power generation.

4.6. Error Analysis

To better reflect the superiority of the CNN–informer prediction model proposed in this paper, the error between the predicted value and the actual value of photovoltaic power is calculated for the BP, CNN, CNN–LSTM, and CNN–informer models. The errors are compared in Table 4. The MAE of the CNN–informer prediction model on sunny days, cloudy days, cloudy and rainy days, and rainy days is 0.821, 1.024, 1.247, and 1.785; the MAPE is 0.024, 0.037, 0.051, and 0.115; and the RMSE is 1.589, 2.256, 3.712, and 4.958. Compared with the BP model, the MAE of the CNN–informer model decreased by 34.2%, 25.7%, 38.1%, and 37.9%; the MAPE decreased by 50.0%, 30.2%, 45.7%, and 44.2%; and the RMSE decreased by 7.3%, 6.5%, 7.5%, and 5.5%. Compared with the CNN model, the MAE decreased by 22.3%, 13.1%, 29.0%, and 33.8%; the MAPE decreased by 35.1%, 24.5%, 28.2%, and 33.9%; and the RMSE decreased by 2.6%, 5.2%, 6.4%, and 4.0%. Compared with the CNN–LSTM model, the MAE of the CNN–informer model decreased by 25.0%, 15.2%, 21.2%, and 29.9%; the MAPE decreased by 20.0%, 17.8%, 17.7%, and 37.8%; and the RMSE decreased by 4.2%, 4.1%, 6.1%, and 3.6%. In summary, the prediction accuracy of the CNN–informer model is generally better than that of the BP, CNN, and CNN–LSTM models, so the model can effectively improve the prediction performance of photovoltaic power generation.
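As an illustration of how the three error measures compared in this section are typically computed, the following minimal sketch uses the standard textbook definitions of MAE, MAPE, and RMSE. It is an assumption that these match the definitions used by the authors; the function names and sample values are hypothetical. MAPE is returned as a fraction, matching the scale of the values in Table 4, and samples whose actual value is zero are skipped (an assumption of this sketch, since the ratio is undefined there, for example at night).

```python
import math
from typing import Sequence


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error, in the unit of the data (kW here)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error as a fraction.

    Assumption of this sketch: samples with actual == 0 are skipped.
    """
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted) if a != 0]
    return sum(ratios) / len(ratios)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error, in the unit of the data (kW here)."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical sample values, not taken from the paper's dataset.
actual_kw = [10.0, 20.0, 30.0]
predicted_kw = [12.0, 18.0, 33.0]
print(mae(actual_kw, predicted_kw), mape(actual_kw, predicted_kw), rmse(actual_kw, predicted_kw))
```

RMSE squares each error before averaging, so it penalizes large deviations more heavily than MAE, which may explain why the RMSE improvements in Table 4 are smaller than the MAE improvements.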

5. Conclusions

Photovoltaic power generation is highly uncertain and dynamic, which causes fluctuations in the power grid and brings new challenges to the management and operation of photovoltaic power generation systems. Accurate prediction of photovoltaic power generation is therefore key to determining a reasonable operation and scheduling plan.

In this paper, a similar day clustering CNN–informer photovoltaic power prediction model is proposed, and its prediction results are compared with those of the BP, CNN, and CNN–LSTM models. The main conclusions are as follows:

(1): The predicted and actual values of photovoltaic power were compared, and the MAE, MAPE, and RMSE of the four models were calculated. The results show that, compared with BP, CNN, and CNN–LSTM, the CNN–informer model has the smallest prediction error for photovoltaic power and the best prediction effect.
(2): Compared with the other prediction models, the prediction accuracy of the CNN–informer model proposed in this paper is significantly improved. For weather types with large irradiance fluctuations, the method also achieves lower errors than the compared models, which indicates that the proposed method is feasible.

Author Contributions

Conceptualization, Z.W. and S.Y.; methodology, Z.W.; software, Z.W.; validation, Z.W., F.P., and H.H.; formal analysis, S.Y.; investigation, Z.W. and S.Y.; resources, S.Y.; data curation, D.L., T.Z., and F.P.; writing—original draft preparation, Z.W.; writing—review and editing, Z.W.; visualization, Z.W.; supervision, S.Y.; project administration, S.Y.; funding acquisition, S.Y. All authors have read and agreed to the published version of the manuscript.

Funding

National Key Research and Development Program of China: 2017YFD0301301.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.


Figure 1. The basic structure of the CNN–informer combination model.

Figure 2. Basic flow for forecasting.

Figure 3. Part of the data after preprocessing.

Figure 4. Correlation analysis.

Figure 5. Comparison of actual and predicted values for sunny days.

Figure 6. Comparison of actual and predicted values for cloudy days.

Figure 7. Comparison of actual and predicted values for cloudy and rainy days.

Figure 8. Comparison of actual and predicted values for rainy days.

Table 1. Variables of the dataset.

Variable	Definition	Unit
Active Power (AP)	Sampled at 10 s intervals and averaged over 5 min.	kW
Weather Temperature Celsius (WTC)	Sampled at 10 s intervals and averaged over 5 min.	°C
Weather Relative Humidity (WRH)	Sampled at 10 s intervals and averaged over 5 min.	%
Global Horizontal Radiation (GHR)	Intensity of solar power received by a horizontal plane at the surface of the Earth.	W/m²
Diffuse Horizontal Radiation (DHR)	Light that has been scattered by atmospheric particles not in the line of direct radiation from the sun.	W/m²
Wind Direction (WD)	Sampled at 10 s intervals and averaged over 5 min.	°
Weather Daily Rainfall (WDR)	Sampled at 10 s intervals and averaged over 5 min.	mm
Wind Speed (WS)	Sampled at 10 s intervals and averaged over 5 min.	m/s

Table 2. Parameter settings.

Parameter Name	Parameter Value
Number of convolutional layers	3
Number of convolutional kernels	7\5\3
Encoder stack	3\2\1
Batch size	128
Dropout	0.01
Optimizer	Adam
Epoch	50
Initial learning rate	0.0001
Activation function	ReLU
Loss function	MSE
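
For readers who want to reuse the settings in Table 2, they can be gathered into a single configuration object. The following is a minimal sketch; the class name `CnnInformerConfig` and its field names are hypothetical and do not come from the authors' code. The kernel sizes are the final 4 × 4, 5 × 5, 6 × 6 choice reported in Section 4.4.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CnnInformerConfig:
    """Hypothetical container for the hyperparameters listed in Table 2."""

    num_conv_layers: int = 3
    conv_kernel_counts: Tuple[int, ...] = (7, 5, 3)  # kernels per convolutional layer
    conv_kernel_sizes: Tuple[int, ...] = (4, 5, 6)   # square kernels: 4x4, 5x5, 6x6
    encoder_stack: Tuple[int, ...] = (3, 2, 1)
    batch_size: int = 128
    dropout: float = 0.01
    optimizer: str = "adam"
    epochs: int = 50
    initial_learning_rate: float = 1e-4
    activation: str = "relu"
    loss: str = "mse"

    def __post_init__(self) -> None:
        # Each convolutional layer needs exactly one kernel count and one kernel size.
        if not (len(self.conv_kernel_counts) == len(self.conv_kernel_sizes) == self.num_conv_layers):
            raise ValueError("kernel counts and sizes must match num_conv_layers")


config = CnnInformerConfig()
print(config)
```

Freezing the dataclass keeps a run's settings immutable, so a variant such as a different kernel size set has to be created as a new object.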

Table 3. Error comparison of different parameters.

Model	Kernel Size, Convolutional Layer 1 (7 Kernels)	Kernel Size, Convolutional Layer 2 (5 Kernels)	Kernel Size, Convolutional Layer 3 (3 Kernels)	MAE/kW	MAPE	RMSE/kW
CNN–informer	3 × 3	4 × 4	5 × 5	4.457	0.124	4.786
CNN–informer	4 × 4	5 × 5	6 × 6	3.846	0.097	4.254
CNN–informer	5 × 5	6 × 6	7 × 7	4.275	0.149	4.952
CNN–informer	6 × 6	7 × 7	8 × 8	5.348	0.192	5.264

Table 4. Error comparison of different prediction models under different weather types.

Weather	Model	MAE/kW	MAPE	RMSE/kW
Sunny Day	BP	1.247	0.048	1.714
	CNN	1.056	0.037	1.632
	CNN–LSTM	1.095	0.030	1.658
	CNN–informer	0.821	0.024	1.589
Cloudy Day	BP	1.379	0.053	2.412
	CNN	1.179	0.049	2.379
	CNN–LSTM	1.208	0.045	2.353
	CNN–informer	1.024	0.037	2.256
Cloudy and Rainy Day	BP	2.016	0.094	4.012
	CNN	1.756	0.071	3.967
	CNN–LSTM	1.582	0.062	3.954
	CNN–informer	1.247	0.051	3.712
Rainy Day	BP	2.875	0.206	5.245
	CNN	2.697	0.174	5.165
	CNN–LSTM	2.547	0.185	5.147
	CNN–informer	1.785	0.115	4.958

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
