The precision of photovoltaic (PV) power forecasting is constrained by meteorological instability and by inaccuracies in numerical weather prediction (NWP). To address this limitation, we propose a predictable-component reconstruction technique based on circulant singular spectrum decomposition (CISSD), which extracts forecastable components efficiently and with reduced computational overhead. The method decomposes NWP irradiance data into distinct predictable components. An optimization mechanism driven by the Exponential Triangular Optimization (ETO) algorithm is developed to mitigate hyperparameter sensitivity during decomposition by tuning CISSD’s parameter configuration. Using the extracted predictable components of irradiance, the historical measured irradiance and PV power are then reconstructed by associating them with these components, which yields fully predictable PV components. Finally, a combined learning framework based on a spatiotemporal heterogeneous graph neural network (STHGNN) and Ns-Transformer is proposed to predict the predictable and fluctuating components of PV power.
2.1. Predictable Component Extraction Based on CISSD
Circulant singular spectrum decomposition (CISSD) [53] is an improved decomposition method based on singular spectral decomposition (SSD). Unlike SSD, CISSD obtains the components at the desired frequencies efficiently during decomposition. Its procedure is still divided into four steps:
- (1) Embedding: this step is the same as in SSD.
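- (2) Decomposition: compute the circulant matrix $\mathbf{S}_C$, find its eigenvalues, and associate the $k$th eigenvalue $\lambda_k$ and the corresponding eigenvector $\mathbf{u}_k$ with the frequency $\omega_k = (k-1)/L$, $k = 1, \dots, L$, where $L$ is the window length. The elements $c_m$ of $\mathbf{S}_C$ are given by the following relation:

$$c_m = \frac{L-m}{L}\,\gamma_m + \frac{m}{L}\,\gamma_{L-m}, \quad m = 0, 1, \dots, L-1 \tag{1}$$

where the sample autocovariance $\gamma_m$ of the series $x_t$, $t = 1, \dots, T$, can be expressed as follows:

$$\gamma_m = \frac{1}{T-m}\sum_{t=1}^{T-m} x_t\, x_{t+m} \tag{2}$$

- (3) Grouping: owing to the symmetry of the power spectral density, $\lambda_k = \lambda_{L+2-k}$. The corresponding eigenvectors are complex and form conjugate pairs, $\mathbf{u}_k = \mathbf{u}^{*}_{L+2-k}$, where $\mathbf{v}^{*}$ denotes the complex conjugate of a vector $\mathbf{v}$. Because $\mathbf{u}_k$ and $\mathbf{u}_{L+2-k}$ correspond to the same harmonic period, they are transformed into a pair of real eigenvectors in order to compute the associated components. To form the elementary matrices, the two elements are first grouped as $B_k = \{k, L+2-k\}$. The elementary matrix by frequency, $\mathbf{X}_{B_k}$, is then computed as the sum of the two elementary matrices $\mathbf{X}_k$ and $\mathbf{X}_{L+2-k}$, related to the eigenvalues $\lambda_k$ and $\lambda_{L+2-k}$ and the frequency $\omega_k$:

$$\mathbf{X}_{B_k} = \mathbf{X}_k + \mathbf{X}_{L+2-k} = \left(\mathbf{u}_k \mathbf{u}_k^{H} + \mathbf{u}_{L+2-k}\mathbf{u}_{L+2-k}^{H}\right)\mathbf{X} = 2\left(\mathcal{R}_{\mathbf{u}_k}\mathcal{R}_{\mathbf{u}_k}^{T} + \mathcal{I}_{\mathbf{u}_k}\mathcal{I}_{\mathbf{u}_k}^{T}\right)\mathbf{X} \tag{3}$$

where $\mathbf{X}$ is the trajectory matrix from the embedding step, $\mathcal{R}_{\mathbf{u}_k}$ denotes the real part of $\mathbf{u}_k$, $\mathcal{I}_{\mathbf{u}_k}$ denotes its imaginary part, and $\mathbf{v}^{H}$ denotes the conjugate transpose of a vector $\mathbf{v}$. Note that the matrix $\mathbf{X}_{B_k}$ is real.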
- (4) Reconstruction: this step is the same as in SSD. After reconstruction, the components of the forecast irradiance sequence at different frequencies are obtained:
where
denotes the value at the
nth time point of the
mth component after decomposition,
denotes the number of sequences after decomposition, and
denotes the length of the sequence.
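The four steps above can be illustrated with a short NumPy sketch. It assumes the standard circulant formulation (an autocovariance-based circulant matrix, Fourier eigenvectors, conjugate-pair grouping, and diagonal averaging for reconstruction). The names `cissd_decompose` and `diagonal_average` are hypothetical and are not taken from the paper, and the sketch does not include the ETO-based window selection.

```python
import numpy as np

def diagonal_average(M):
    """Average the anti-diagonals of an L x N matrix into a series of length L + N - 1."""
    L, N = M.shape
    total = np.zeros(L + N - 1)
    count = np.zeros(L + N - 1)
    for i in range(L):
        total[i:i + N] += M[i]   # entry M[i, j] belongs to time index i + j
        count[i:i + N] += 1
    return total / count

def cissd_decompose(x, L):
    """Split x into L // 2 + 1 components, one per frequency k / L (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    T = x.size
    N = T - L + 1
    # (1) Embedding: L x N trajectory matrix whose columns are lagged windows.
    X = np.column_stack([x[j:j + L] for j in range(N)])
    # (2) Decomposition: sample autocovariances and first row of the circulant matrix.
    gamma = np.array([x[:T - m] @ x[m:] / (T - m) for m in range(L)])
    c = np.empty(L)
    c[0] = gamma[0]
    m = np.arange(1, L)
    c[1:] = (L - m) / L * gamma[m] + m / L * gamma[L - m]
    eigvals = np.fft.fft(c).real          # c is symmetric, so the eigenvalues are real
    # (3) Grouping and (4) reconstruction, one real projector per frequency.
    n_groups = L // 2 + 1
    rows = np.arange(L)
    components = np.empty((n_groups, T))
    for k in range(n_groups):
        u = np.exp(-2j * np.pi * rows * k / L) / np.sqrt(L)
        if k == 0 or 2 * k == L:          # unpaired frequencies: eigenvector is real
            proj = np.outer(u.real, u.real)
        else:                             # conjugate pair {k, L - k} merged into one group
            proj = 2.0 * (np.outer(u.real, u.real) + np.outer(u.imag, u.imag))
        components[k] = diagonal_average(proj @ X)
    freqs = np.arange(n_groups) / L
    return components, freqs, eigvals[:n_groups]

# Usage on a synthetic series: slow trend plus a 24-step cycle.
t = np.arange(240)
series = 5.0 + 0.01 * t + np.sin(2 * np.pi * t / 24)
components, freqs, eigvals = cissd_decompose(series, L=48)
```

Because the projectors sum to the identity, the components add back up to the input series, which gives a quick sanity check on any implementation.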
To mitigate the hyperparameter sensitivity of CISSD during decomposition, we adopt the Exponential Triangular Optimization (ETO) algorithm [54]. This approach optimizes the window length parameter of CISSD, improving decomposition quality. After optimization, the predictable elements and the high-frequency oscillatory components are extracted. The fitness function governing ETO integrates permutation entropy (Pe) [55,56,57] and a correlation metric: a low Pe in the low-frequency predictable component, coupled with a high correlation with the source signal, indicates (1) reduced component complexity, (2) elevated predictability, and (3) preserved signal integrity. Therefore, the fitness function is constructed as the sum of the modified correlation coefficient and Pe between the predictable component and the original signal, formalized as follows:
where,
denotes the fitness function,
and
denote the original sequence and the decomposed predictable component sequence, respectively,
and
denote the mean value of the original sequence and the decomposed predictable component sequence, respectively,
denotes the correlation coefficient modification operator,
denotes the formula of Pe,
denotes the length of the sequence, and
denotes the correlation coefficient.
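A fitness of this kind can be sketched as follows. This is only an illustration under stated assumptions: the correlation "modification" is taken to be `1 - |r|` so that a lower score is better, the permutation entropy is normalised by `log(order!)`, and the names `permutation_entropy` and `fitness` are hypothetical. The exact operator used in the paper and the ETO search itself are not reproduced here.

```python
import math
from collections import Counter
import numpy as np

def permutation_entropy(x, order=3, delay=1):
    """Normalised permutation entropy in [0, 1]; 0 means a single ordinal pattern."""
    x = np.asarray(x, dtype=float)
    n = x.size - (order - 1) * delay      # number of embedded vectors
    counts = Counter(
        tuple(np.argsort(x[i:i + order * delay:delay], kind="stable"))
        for i in range(n)
    )
    p = np.array(list(counts.values()), dtype=float) / n
    return float(-(p * np.log(p)).sum() / math.log(math.factorial(order)))

def fitness(original, component, order=3, delay=1):
    """Assumed form: Pe of the component plus (1 - |Pearson r|) with the original."""
    r = np.corrcoef(original, component)[0, 1]
    return permutation_entropy(component, order, delay) + (1.0 - abs(r))

# Usage: a smooth component should score lower (better) than a noisy one.
rng = np.random.default_rng(0)
t = np.arange(300)
smooth = np.sin(2 * np.pi * t / 60)
noise = 0.2 * rng.standard_normal(300)
signal = smooth + noise
score_smooth = fitness(signal, smooth)
score_noise = fitness(signal, noise)
```

An optimizer would evaluate this score for each candidate window length and keep the one with the lowest value.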
2.2. Reconstruction of the Predictable Component of the Correlation of Weather/Power
The trend component and the high-frequency fluctuation components of the NWP irradiance sequence are obtained from Equation (4). After decomposition, the first component can generally be taken as the trend component of the irradiance sequence. In this paper, it serves as the predictable component of the historical real irradiance sequence and as the basis for reconstructing the predictable component of the PV power sequence. It is assumed that the predictable component sequence of the forecasted irradiance series is
, and its fluctuation component sequence is
, whose sum represents the sequence before the original decomposition. Based on this, the predictable component for the historical PV power sequence can be obtained with the following expression:
where
,
, and
denote the
ith reconstructed historical predictable PV power point, the predictable component of the forecasted irradiance sequence, and the PV power point before reconstruction, respectively. The predictable PV power component is constructed point by point as the product of the historical total PV power and the ratio of the predictable component of the forecasted irradiance to the total forecasted irradiance. Because this reconstruction ties the predictable PV power component to the predictable irradiance component through a linear relationship, the predictable PV power component can be predicted completely and accurately. Therefore, the fluctuation component of power can be expressed as follows:
where
denotes the
ith reconstructed historical fluctuating PV power point.
Once the predictable component of power has been extracted, the remaining fluctuation component must be predicted from a separate set of fluctuation input features. This paper introduces the historical measured irradiance sequence as the source of this information, constructs the predictable and fluctuation components of the historical measured irradiance based on the predictable input
, and predicts the fluctuation component of PV power with a time series prediction method. Since the predictable component of NWP irradiance can completely and effectively predict the predictable component of PV power, the same predictable component must be separated from the measured irradiance sequence, and the remaining component is used as the basis for predicting the fluctuation component of PV power. Accordingly, in this paper, the predictable component of the historical measured irradiance series is directly replaced by the NWP irradiance predictable component, with the following expression:
where
denotes the predictable component point of the
ith reconstructed historical measured irradiance series. On this basis, an expression for the calculation of its fluctuation component can be obtained:
where
and
denote the
ith reconstructed historical measured irradiance series and its fluctuation component points, respectively.
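The reconstruction described in this subsection amounts to a few element-wise operations. The sketch below follows the verbal description (ratio of predictable to total forecast irradiance times historical power, with residuals as fluctuation components). The function and variable names are hypothetical, and the zero-irradiance guard (`eps`) is an added assumption for night-time samples that the text does not discuss.

```python
import numpy as np

def split_power(power, nwp_pred, nwp_fluct, eps=1e-9):
    """Split historical PV power into predictable and fluctuation parts."""
    power = np.asarray(power, dtype=float)
    nwp_pred = np.asarray(nwp_pred, dtype=float)
    total = nwp_pred + np.asarray(nwp_fluct, dtype=float)   # NWP irradiance before decomposition
    # Point-by-point ratio; set to 0 where total irradiance is ~0 (assumption).
    ratio = np.divide(nwp_pred, total, out=np.zeros_like(total), where=np.abs(total) > eps)
    p_pred = ratio * power
    p_fluct = power - p_pred
    return p_pred, p_fluct

def split_measured_irradiance(measured, nwp_pred):
    """Predictable part is replaced by the NWP predictable component; the rest is fluctuation."""
    g_pred = np.asarray(nwp_pred, dtype=float).copy()
    g_fluct = np.asarray(measured, dtype=float) - g_pred
    return g_pred, g_fluct

# Usage with three illustrative samples (values are made up for the example).
nwp_pred = np.array([0.0, 300.0, 600.0])
nwp_fluct = np.array([0.0, 100.0, -100.0])
power = np.array([0.0, 40.0, 50.0])
measured = np.array([0.0, 420.0, 480.0])
p_pred, p_fluct = split_power(power, nwp_pred, nwp_fluct)
g_pred, g_fluct = split_measured_irradiance(measured, nwp_pred)
```

By construction, each predictable and fluctuation pair sums back to the original sequence, so no information is lost in the split.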
On this basis, expressions for the prediction relationships for the predictable and unpredictable components can be obtained as follows:
where
indicates that
is used as the model input to predict
. Therefore, the predictable component of NWP irradiance is used as the input to predict the predictable component of PV power in the prediction stage, and the fluctuation component of historical measured irradiance and the historical fluctuation component of PV power are used as the inputs to predict the PV power fluctuation component in the time period to be predicted.
In order to enhance the predictability of the fluctuation component, this paper adopts CISSD to decompose both
as well as
, and constructs the prediction model for different components; thus, Equation (13) can be expressed as follows:
where
, as well as
, denote the
ith value corresponding to the
and
components of the
jth decomposition, respectively.
2.3. Combined Multicomponent Prediction Based on STHGNN-Ns-Transformer
This paper specifies the prediction input and output mechanisms for the predictable component and the fluctuation component of PV power. On this basis, a combined framework based on Ns-Transformer [58] and a heterogeneous spatiotemporal graph neural network is proposed for predicting the PV power predictable component and multiple fluctuation sub-components. This combination aims to capture both the dynamic spatial dependencies between nodes and the temporal dependencies of each node’s own evolution in complex spatiotemporal data, which is the key to handling the PV power prediction task. The combined model learns a mapping function $F$ that takes a history of $T$ time steps of graph-structured spatiotemporal data $X = [X_1, X_2, \dots, X_T] \in \mathbb{R}^{N \times T \times C}$ (where $N$ is the number of nodes, $T$ is the number of historical time steps, and $C$ is the node feature dimension, as in Figure 1) and predicts the data of the future $T'$ time steps, $Y = [Y_1, Y_2, \dots, Y_{T'}] \in \mathbb{R}^{N \times T' \times C'}$ ($C'$ may differ from $C$). The spatial relationship modelling capability of STHGNN is combined with the efficient and flexible temporal dependency modelling capability of Ns-Transformer. Such combined models usually adopt a ‘space first, time second’ or ‘space-time interleaved’ architectural strategy.
(1) STHGNN: Different components of PV power must be predicted simultaneously, so the spatial modelling stage mainly involves the joint prediction of multiple types of components. To improve the learning ability of the model, this paper therefore uses a heterogeneous spatiotemporal graph neural network to model this process. The graph convolution is performed independently at each time step
t to aggregate the neighbourhood information:
where
denotes the input features at time step
t,
denotes the scaled graph Laplacian matrix,
denotes the Chebyshev polynomials that approximate the spectral domain convolution kernel,
denotes the learnable parameters, and σ denotes the activation function (e.g., ReLU). Two temporal modules stacked with one spatial module form a spatiotemporal graph convolutional layer for spatiotemporal feature extraction. The historical PV predictable component, irradiance predictable component, and fluctuation component are fed to the temporal feature extraction module for temporal convolution, and the weighted adjacency matrix of the PV components is fed to the spatial feature extraction module for spatial convolution. Each PV component is represented as a node, and the correlation between nodes is represented by weighted edges. The PV component at time
t can be represented by the spatiotemporal graph as follows:
where
denotes the set of power generation data of all nodes in the spatiotemporal graph at time
t,
denotes the weighted adjacency matrix of the connecting lines in the spatiotemporal graph, and
denotes the weight coefficient between node
and node
. A graph convolutional network is used to extract the spatial features of each node in the PV component topology graph, and the normalised graph Laplacian is used in the spectral domain to define the PV component graph structure as follows:

$$L = I - D^{-1/2} E D^{-1/2}$$

where $I$ denotes the identity matrix, $E$ denotes the weighted adjacency matrix of the PV plant power graph structure, and $D$ denotes the degree matrix of the PV plant power graph structure.
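The spatial module can be sketched with NumPy as follows. This is a minimal illustration that assumes the usual symmetric normalised Laplacian, the common rescaling $2L/\lambda_{\max} - I$, and the Chebyshev recurrence $T_k = 2\tilde{L}T_{k-1} - T_{k-2}$. The names `normalized_laplacian` and `cheb_graph_conv`, the parameter tensor `theta`, and the example graph are hypothetical.

```python
import numpy as np

def normalized_laplacian(E):
    """L = I - D^{-1/2} E D^{-1/2} for a symmetric weighted adjacency matrix E."""
    E = np.asarray(E, dtype=float)
    deg = E.sum(axis=1)
    d_inv_sqrt = np.zeros_like(deg)
    d_inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5      # isolated nodes keep a zero scale
    return np.eye(E.shape[0]) - d_inv_sqrt[:, None] * E * d_inv_sqrt[None, :]

def cheb_graph_conv(X, E, theta):
    """One Chebyshev graph convolution at a single time step.

    X: (N, C_in) node features, E: (N, N) adjacency, theta: (K, C_in, C_out).
    Assumes the graph has at least one edge so that lambda_max > 0.
    """
    X = np.asarray(X, dtype=float)
    L = normalized_laplacian(E)
    lam_max = np.linalg.eigvalsh(L).max()
    L_tilde = 2.0 * L / lam_max - np.eye(L.shape[0])   # spectrum rescaled to [-1, 1]
    T_prev, T_curr = np.eye(L.shape[0]), L_tilde        # T_0 and T_1
    out = T_prev @ X @ theta[0]
    for k in range(1, theta.shape[0]):
        out += T_curr @ X @ theta[k]
        T_prev, T_curr = T_curr, 2.0 * L_tilde @ T_curr - T_prev
    return np.maximum(out, 0.0)                         # ReLU activation

# Usage: three nodes on a path graph, two input features, four output features.
E_path = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
X_t = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
theta = np.random.default_rng(0).standard_normal((3, 2, 4))
H_t = cheb_graph_conv(X_t, E_path, theta)
```

In a full model the same convolution would be applied independently at every time step, with `theta` learned by gradient descent rather than sampled at random.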
The eigen-decomposition of the normalised Laplacian gives $L = U \Lambda U^{T}$, where $U$ is the matrix consisting of the eigenvectors of $L$ and $\Lambda$ is the diagonal matrix consisting of the eigenvalues of $L$. Simplifying the graph convolution operation, the graph convolution formula can be obtained as follows:

$$g * x = U\left(\left(U^{T} g\right) \odot \left(U^{T} x\right)\right)$$

where $x$ denotes the input of the graph convolutional network, i.e., the output of the temporal convolution module, $g$ denotes the convolution kernel, and $\odot$ denotes the Hadamard product. Using $g_{\theta} = \mathrm{diag}(U^{T} g)$ as the convolution kernel, the graph convolution formula is obtained as follows:

$$g_{\theta} * x = U g_{\theta} U^{T} x$$
(2) Constructing spatiotemporal heterogeneous graphs: to capture the evolutionary relationships among different PV power components, such as correlation, fluctuation synchrony, and fluctuation magnitude, this paper builds on STHGNN and adopts a multi-type heterogeneous graph neural network to model the relationships among PV power components. The spatiotemporal graph of STHGNN is a triple:
where
is the heterogeneous spatiotemporal graph of the STHGNN,
denotes the correlation adjacency matrix describing the component evolution,
denotes the fluctuation synchronisation adjacency matrix describing the component evolution, and
denotes the magnitude-similarity adjacency matrix describing the component evolution.
To characterize the roles of the three adjacency matrices, this paper uses a different calculation rule for each connection relationship. For
, the connection relationships in the adjacency matrix are computed from the correlation coefficient, and the correlation strength quantifies the connection weight. The calculation formula is as follows:
where
denotes the weight of the correlation adjacency matrix between the target node
and the reference node
,
and
denote the sequence of the
ith target node and the sequence of the
jth reference node, respectively, and
denotes the number of system nodes.
In order to describe the synchronisation of fluctuations between nodes, this paper adopts the minimum delay correlation to describe this connectivity relationship. Firstly, the delay correlation matrix between the target node
and reference node
is calculated:
where
denotes the correlation between the target node
and reference node
after delaying
time steps, and
denotes the delay time step threshold. After this calculation, the delay correlation matrix of
target nodes
and reference node
can be obtained, and finally, the optimal delay step matrix between the target node
and reference node
can be obtained:
where
denotes the optimal delay step matrix between the target node
and the reference node
. A larger delay step indicates weaker fluctuation synchronisation between the nodes, and vice versa. On this basis, the adjacency matrix weight
between the target node
and reference node
can be calculated as follows.
To describe the similarity in fluctuation magnitude between nodes, this paper adopts the average absolute amplitude difference. A larger average absolute amplitude difference between the target node $i$ and the reference node $j$ indicates a lower similarity in the degree of fluctuation between the two components, and vice versa. It is calculated as follows:
where
denotes the fluctuation size similarity between the target node
i and the reference node
j,
denotes the length of the reference node and the target node used to compute the similarity, and
and
denote the
kth value of the target node
i and the reference node
j, respectively. Together, these calculations yield the final heterogeneous graph g, which describes the correlation, fluctuation synchronisation, and magnitude similarity between nodes.
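The three adjacency matrices can be illustrated with the sketch below. The weighting functions are assumptions chosen only to match the qualitative description: absolute Pearson correlation for correlation strength, a weight that decreases linearly with the delay that maximises the lagged correlation, and `1 / (1 + mean absolute difference)` for magnitude similarity. The paper's exact formulas may differ, and all function names are hypothetical.

```python
import numpy as np

def correlation_adjacency(X):
    """Absolute Pearson correlation between every pair of node series. X: (N, T)."""
    return np.abs(np.corrcoef(X))

def delay_sync_adjacency(X, max_delay):
    """Delay (0..max_delay) maximising lagged correlation, and a weight that shrinks with it."""
    N, T = X.shape
    best_delay = np.zeros((N, N), dtype=int)
    for i in range(N):
        for j in range(N):
            corr = [np.corrcoef(X[i, d:], X[j, :T - d])[0, 1] for d in range(max_delay + 1)]
            best_delay[i, j] = int(np.argmax(corr))
    weights = 1.0 - best_delay / (max_delay + 1)     # assumed mapping: larger delay, smaller weight
    return best_delay, weights

def magnitude_adjacency(X):
    """Similarity from the mean absolute amplitude difference (assumed mapping)."""
    mad = np.abs(X[:, None, :] - X[None, :, :]).mean(axis=-1)
    return 1.0 / (1.0 + mad)

# Usage: node 1 is node 0 shifted by three steps; node 2 is node 0 scaled by two.
rng = np.random.default_rng(1)
base = rng.standard_normal(203)
nodes = np.stack([base[:200], base[3:], 2.0 * base[:200]])
A_corr = correlation_adjacency(nodes)
delays, A_sync = delay_sync_adjacency(nodes, max_delay=6)
A_mag = magnitude_adjacency(nodes)
hetero_graph = (A_corr, A_sync, A_mag)   # the three relation types of the heterogeneous graph
```

Series with zero variance would produce undefined correlations, so constant components should be filtered out or handled separately before building the graph.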
(3) Ns-Transformer: To further integrate the spatiotemporal features produced by STHGNN, this paper adopts a non-stationary Transformer (Ns-Transformer), which extracts the non-stationary information in these features to achieve high-precision prediction of PV fluctuation power. The standard Transformer uses conventional positional encoding and ignores inter-node heterogeneity, whereas Ns-Transformer learns a unique positional encoding vector for each node
i:
where
denotes the feature of node
i at time
t, which is derived from the output of the STHGNN, and
denotes the trainable parameter, which portrays the intrinsic temporal pattern of node
i. Due to the high complexity of the traditional standard self-attention, Ns-Transformer introduces a local window mask,
, and employs a sparse self-attention mechanism to solve this problem as follows:
where
, if and only if
. For a fixed window size, this operation reduces the attention cost from quadratic to linear in the sequence length, which supports long-sequence processing, preserves key local dependencies, and suppresses noisy associations.
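The masked attention step can be sketched as follows. The sketch assumes that the mask allows position $i$ to attend to position $j$ only when $|i - j| \le w$ and that the learned positional encoding is simply added to the node features. The window rule, the function names, and the random weights are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np

def local_window_mask(T, w):
    """Boolean (T, T) mask that is True where |i - j| <= w."""
    idx = np.arange(T)
    return np.abs(idx[:, None] - idx[None, :]) <= w

def windowed_self_attention(H, pos, Wq, Wk, Wv, w):
    """Single-head self-attention for one node, restricted to a local window.

    H: (T, d) features from the graph module, pos: (T, d) node-specific positional encoding.
    """
    Z = H + pos
    Q, K, V = Z @ Wq, Z @ Wk, Z @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])
    scores = np.where(local_window_mask(len(Z), w), scores, -np.inf)   # block out-of-window pairs
    scores = scores - scores.max(axis=1, keepdims=True)                # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ V, weights

# Usage with random features and parameters for a single node.
rng = np.random.default_rng(0)
T_len, d = 8, 4
H_node = rng.standard_normal((T_len, d))
pos_node = rng.standard_normal((T_len, d))     # would be a trainable parameter in practice
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
out, attn = windowed_self_attention(H_node, pos_node, Wq, Wk, Wv, w=2)
```

This dense version still builds the full score matrix and only demonstrates the masking logic. An efficient implementation would compute scores for in-window pairs only, which is where the complexity reduction comes from.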
The overall technical framework of this paper is as follows:
- (1) Extraction of predictable components of the NWP irradiance sequence: the historical NWP irradiance sequence is decomposed using ETO-optimized CISSD to extract its predictable component and fluctuation sequence.
- (2) Reconstruction of the historical real irradiance and power sequences: based on the predictable component of NWP irradiance obtained in step (1), the predictable and fluctuating components of the historical power sequence and of the historical real irradiance sequence are obtained according to Equations (4)–(14).
- (3) PV power prediction based on the combined STHGNN-Ns-Transformer architecture: for the PV power components extracted in steps (1) and (2), a spatiotemporal heterogeneous graph neural network combined with Ns-Transformer models the spatiotemporal evolution of, and dependencies between, the different components to achieve high-precision prediction of PV power. The overall research framework is shown in Figure 2.