Probabilistic Photovoltaic Power Forecasting with Reliable Uncertainty Quantification via Multi-Scale Temporal–Spatial Attention and Conformalized Quantile Regression

Wang, Guanghu; Zhou, Yan; Yan, Yan; Zhou, Zhihan; Yang, Zikang; Dai, Litao; Huang, Junpeng

doi:10.3390/su18020739

by Guanghu Wang ¹, Yan Zhou ¹,*, Yan Yan ², Zhihan Zhou ³, Zikang Yang ¹, Litao Dai ¹ and Junpeng Huang ¹

¹ School of Electronic Engineering, Jiangsu Ocean University, Lianyungang 222005, China

² State Grid Ningxia Electric Power Research Institute, Yinchuan 750011, China

³ Makarov College of Marine Engineering, Jiangsu Ocean University, Lianyungang 222005, China

* Author to whom correspondence should be addressed.

Sustainability 2026, 18(2), 739; https://doi.org/10.3390/su18020739

Submission received: 24 November 2025 / Revised: 9 January 2026 / Accepted: 9 January 2026 / Published: 11 January 2026

(This article belongs to the Topic Sustainable Energy Systems)

Abstract

Accurate probabilistic forecasting of photovoltaic (PV) power generation is crucial for grid scheduling and renewable energy integration. However, existing approaches often produce prediction intervals with limited calibration accuracy, and the interdependence among meteorological variables is frequently overlooked. This study proposes a probabilistic forecasting framework based on a Multi-scale Temporal–Spatial Attention Quantile Regression Network (MTSA-QRN) and an adaptive calibration mechanism to enhance uncertainty quantification and ensure statistically reliable prediction intervals. The framework employs a dual-pathway architecture: a temporal pathway combining Temporal Convolutional Networks (TCN) and multi-head self-attention to capture hierarchical temporal dependencies, and a spatial pathway based on Graph Attention Networks (GAT) to model nonlinear meteorological correlations. A learnable gated fusion mechanism adaptively integrates temporal–spatial representations, and weather-adaptive modules enhance robustness under diverse atmospheric conditions. Multi-quantile prediction intervals are calibrated using conformalized quantile regression to ensure reliable uncertainty coverage. Experiments on a real-world PV dataset (15 min resolution) demonstrate that the proposed method offers more accurate and sharper uncertainty estimates than competitive benchmarks, supporting risk-aware operational decision-making in power systems. Quantitative evaluation on a real-world 40 MW photovoltaic plant demonstrates that the proposed MTSA-QRN achieves a CRPS of 0.0400 before calibration, representing an improvement of over 55% compared with representative deep learning baselines such as Quantile-GRU, Quantile-LSTM, and Quantile-Transformer. After adaptive calibration, the proposed method attains a reliable empirical coverage close to the nominal level (PICP₉₀ = 0.9053), indicating effective uncertainty calibration. 
Although the calibrated prediction intervals become wider, the model maintains a competitive CRPS value (0.0453), striking a favorable balance between reliability and probabilistic accuracy. These results demonstrate the effectiveness of the proposed framework for reliable probabilistic photovoltaic power forecasting.

Keywords:

photovoltaic power forecasting; probabilistic prediction; multi-scale temporal-spatial attention; conformalized quantile regression; uncertainty quantification; graph attention networks

1. Introduction

Solar PV energy is widely regarded as one of the most promising renewable energy alternatives owing to its environmental sustainability, technological maturity, and ubiquitous availability [1]. PV systems have consequently become an essential component of sustainable energy development worldwide [2]. According to the International Energy Agency, the global cumulative installed PV capacity reached 2.2 TW by the end of 2024, with 602 GW of new installations added in that year alone, representing a year-on-year growth of 32% [3]. China, as the world’s largest PV market, achieved 887 GW of total installed capacity by the end of 2024, accounting for approximately 40% of global capacity, with a record-breaking 277 GW added in 2024 [4]. With annual capacity growth rates exceeding 30% in recent years, the solar photovoltaic industry has entered a phase of rapid expansion.

To support reliable grid operation, accurate short-term PV forecasting has become increasingly important. Existing forecasting approaches can generally be categorized into physical models, statistical learning methods, and hybrid frameworks [5]. Physical models depend on numerical weather predictions but are sensitive to parameter uncertainties [6,7]. Data-driven and deep learning models, including CNNs, LSTMs, and attention architectures, have demonstrated competitive performance by capturing nonlinear temporal patterns [8,9,10,11,12,13,14,15,16]. In recent years, probabilistic forecasting has attracted growing attention due to its ability to quantify forecast uncertainty via prediction intervals or distributions, thereby enabling risk-aware decision-making in power system operations [17,18,19,20]. Quantile regression is widely adopted due to its distribution-free nature [21,22,23,24,25], yet deep quantile models often exhibit miscalibrated interval coverage under changing weather conditions [26,27]. Conformalized Quantile Regression (CQR) provides distribution-free statistical coverage [28], but efficient calibration strategies for large-scale PV forecasting remain underexplored.

Meanwhile, advances in remote sensing technologies have provided critical support for solar irradiance estimation and PV forecasting. Satellite-derived irradiance and meteorological products offer wide spatial coverage and effectively capture cloud-driven rapid variability, which is essential for improving the credibility of probabilistic forecasts. Consequently, the integration of deep learning architectures with uncertainty-aware modeling using remote-sensing data has emerged as an important research direction, aiming to enhance predictive robustness in high-penetration renewable energy systems [29,30,31,32,33].

Despite the progress achieved to date, two key challenges remain. First, PV generation exhibits complex multi-scale temporal dynamics ranging from second-level fluctuations to seasonal patterns due to cloud movement, diurnal cycles, and synoptic-scale changes [34,35,36,37]. Existing models struggle to jointly capture these hierarchical dependencies in an end-to-end manner. Second, forecasting accuracy is strongly influenced by nonlinear interdependencies among meteorological variables such as irradiance, temperature, and wind speed [38,39,40,41]. Although attention mechanisms enable adaptive temporal modeling [42,43], spatial correlations among meteorological features are often overlooked [44,45]. Graph Attention Networks (GATs) can infer dynamic dependency structures [46,47], yet their integration into probabilistic PV forecasting remains insufficiently explored under varying weather conditions [48,49].

To provide a systematic summary of the above discussion, Table 1 compares typical photovoltaic power forecasting approaches based on their technical routes.

Beyond the specific domain of PV forecasting, recent studies have significantly advanced probabilistic and spatiotemporal modeling in related time-series fields. For instance, in power load forecasting, a non-crossing sparse-group Lasso-quantile regression deep neural network has been proposed to effectively address the quantile crossing problem through regularization [50]. Similarly, regarding complex spatiotemporal dynamics, interactive attention-based deep networks have been successfully applied to remaining useful life prediction by fusing multisource information [51].

However, compared with these existing architectures, the specific characteristics of photovoltaic power generation require specialized structural designs to address remaining challenges. First, PV generation exhibits distinct multi-scale temporal dynamics (e.g., high-frequency cloud transients vs. low-frequency diurnal cycles) that differ from general load or equipment degradation patterns, requiring explicit disentanglement. Second, standard interactive attention mechanisms often lack the ability to dynamically adapt to varying atmospheric regimes (e.g., Sunny vs. Rainy), which is essential for accurately modeling meteorological interactions in PV systems. Finally, to ensure statistically valid prediction intervals under extreme weather shifts, an adaptive calibration strategy is necessary beyond standard regularization techniques.

Given that PV power generation is jointly influenced by multi-scale temporal dynamics and spatially correlated meteorological disturbances, existing approaches still face limitations in accurately capturing these complex behaviors and reliably quantifying forecasting uncertainty. To address these challenges, this paper proposes an MTSA-QRN for short-term probabilistic PV power forecasting. First, dilated convolutions and multi-head self-attention are jointly employed to model hierarchical temporal dependencies, enhancing the simultaneous characterization of high-frequency cloud-induced fluctuations and low-frequency diurnal evolution. Second, an adaptive Graph Attention Network is introduced to explicitly capture nonlinear interactions among meteorological variables, enabling more responsive modeling of weather-driven volatility. Moreover, a weather-adaptive fusion mechanism is designed to dynamically adjust the contribution of temporal and spatial features under different atmospheric regimes, improving robustness across diverse meteorological conditions. Finally, an adaptive conformalized quantile regression strategy is incorporated to calibrate prediction intervals, ensuring statistically reliable coverage performance at multiple confidence levels. The main contributions of this study are summarized as follows:

A unified temporal–spatial probabilistic forecasting framework is developed by systematically coordinating multi-scale temporal modeling and spatial dependency learning. Specifically, dilated TCN and multi-head self-attention are jointly employed to capture hierarchical temporal dependencies, while GAT-based modeling is used to characterize nonlinear meteorological interactions. Rather than introducing new standalone modules, this design enables a more comprehensive representation of PV generation dynamics under multi-source disturbances, leading to improved predictive performance in complex scenarios, as validated by comparative experiments.
To address regime-dependent photovoltaic power behaviors, a weather-related feature fusion strategy is designed to adaptively adjust the contributions of temporal and spatial representations. This mechanism allows the model to preserve stable trend information under relatively smooth conditions while remaining sensitive to pronounced power fluctuations, thereby alleviating the performance degradation commonly caused by fixed fusion schemes and improving generalization across varying meteorological situations.
An adaptive conformalized quantile regression (CQR) calibration strategy is incorporated to balance prediction interval coverage reliability and interval sharpness. By iteratively adjusting calibration parameters in a data-driven manner, the proposed approach avoids overly conservative interval widening and yields more informative uncertainty estimates. Experimental results demonstrate that this calibration strategy provides consistent coverage performance while maintaining practical usefulness for risk-aware decision-making.

The remainder of this paper is organized as follows. Section 2 presents the theoretical background underpinning the proposed methodology, including multi-scale decomposition, attention mechanisms, and probabilistic modeling formulations. Section 3 introduces the MTSA-QRN model in detail, covering temporal–spatial representation learning, weather-adaptive feature fusion, and adaptive quantile calibration. Section 4 reports the experimental evaluation conducted on real PV operational data and discusses comparative probabilistic forecasting performance against representative baselines. Finally, Section 5 concludes the study and outlines future research directions.

2. Methodological Background

Reliable probabilistic PV forecasting requires modeling temporal dependencies, meteorological feature interactions, and statistically valid uncertainty quantification. This section briefly introduces the theoretical foundations supporting the proposed temporal–spatial attention architecture and adaptive calibration strategy, while Section 3 elaborates the core innovations.

2.1. Temporal Dependency Modeling Using Attention Mechanisms

Photovoltaic power time series are highly non-stationary due to cloud dynamics and atmospheric instability, which cause rapid changes in irradiance and strong short-term fluctuations in power output. To adaptively capture temporal dependencies under varying meteorological conditions, attention mechanisms dynamically emphasize the most informative timestamps and features.

Let $X \in \mathbb{R}^{n \times d}$ denote the input feature matrix, where $n$ denotes the number of time steps and $d$ denotes the feature dimension including historical PV power and meteorological variables such as irradiance, temperature, humidity, and wind speed. The query (Q), key (K), and value (V) representations are obtained through three learnable projection matrices $W^{Q}$, $W^{K}$, $W^{V} \in \mathbb{R}^{d \times d_{k}}$:

$$\begin{cases} Q = X W^{Q} \\ K = X W^{K} \\ V = X W^{V} \end{cases} \tag{1}$$

In this formulation, $Q$ represents the direction of attention that the model aims to focus on at a given time, $K$ encodes the contextual information provided by each temporal or meteorological feature, and $V$ carries the semantic or numerical content to be aggregated according to the learned attention weights. The similarity between each query and key determines the attention scores, which are normalized using the softmax function to ensure that the attention weights sum to one. The resulting self-attention output is given by

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{Q K^{T}}{\sqrt{d_{k}}}\right) V \tag{2}$$

Here, the scaling factor $\sqrt{d_{k}}$ prevents the dot-product values from becoming excessively large, which would otherwise push the softmax into saturated regions and lead to vanishing gradients. This mechanism allows the network to dynamically adjust the relative importance of each time step or meteorological feature, thereby enabling adaptive modeling of temporal dependencies and spatial correlations under varying weather conditions. Such flexibility is crucial for PV forecasting, where the relationship between irradiance and power generation can fluctuate rapidly.
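As an illustration of Equations (1) and (2), the following minimal NumPy sketch projects a random input matrix into queries, keys, and values and applies scaled dot-product attention. The matrix sizes, the random weights, and the function names are assumptions chosen for illustration only; they are not the settings or the implementation used in this study.

```python
import numpy as np


def softmax(scores, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = scores - scores.max(axis=axis, keepdims=True)
    weights = np.exp(shifted)
    return weights / weights.sum(axis=axis, keepdims=True)


def scaled_dot_product_attention(X, W_q, W_k, W_v):
    """Equations (1)-(2): project X to Q, K, V, then aggregate V by attention weights."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    d_k = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k))  # shape (n, n); each row sums to one
    return weights @ V, weights


# Illustrative sizes (assumed): n time steps, d input features, d_k projected features.
rng = np.random.default_rng(0)
n, d, d_k = 8, 5, 4
X = rng.normal(size=(n, d))
W_q, W_k, W_v = (rng.normal(size=(d, d_k)) for _ in range(3))
output, weights = scaled_dot_product_attention(X, W_q, W_k, W_v)
```

Row $t$ of `weights` indicates how strongly time step $t$ attends to every other time step, and `output` has one $d_{k}$-dimensional vector per time step.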

Building upon this foundation, the multi-head attention mechanism extends self-attention by performing multiple parallel attention computations in different representation subspaces. This design allows the model to capture diverse patterns and dependencies simultaneously. The multi-head attention is formulated as

$$\begin{cases} \mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_{1}, \dots, \mathrm{head}_{h}) W^{O} \\ \mathrm{head}_{i} = \mathrm{Attention}(Q W_{i}^{Q}, K W_{i}^{K}, V W_{i}^{V}) \end{cases} \tag{3}$$

To clarify the projection and mapping relationships in Equation (3), Figure 1 illustrates the complete multi-head attention mechanism. Given the input feature matrix $X$, linear transformations parameterized by $W^{Q}$, $W^{K}$, and $W^{V}$ are applied to project the input into query ($Q$), key ($K$), and value ($V$) representations. These representations are further mapped into multiple subspaces corresponding to different attention heads. Each head performs scaled dot-product attention (Equation (2)) in a lower-dimensional subspace, typically with dimensionality $d_{k} = d / h$. The outputs from all attention heads are then concatenated and passed through an output projection matrix $W^{O}$ to restore the original feature dimensionality.

Each attention head operates in a distinct subspace, enabling the network to focus on different aspects of temporal and meteorological dependencies. Some heads may specialize in short-term fluctuations, such as those caused by rapid cloud movements, while others capture long-term patterns such as diurnal and seasonal trends. Through this multi-head structure, the model achieves a more comprehensive understanding of the complex nonlinear relationships in PV power time series, effectively integrating both fine-grained dynamics and global temporal patterns.
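The multi-head computation of Equation (3) can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions: the queries, keys, and values are all set to the raw input $X$ (plain self-attention), the per-head weights are random, and the sizes $n$, $d$, and $h$ are arbitrary. None of these choices are taken from the study.

```python
import numpy as np


def softmax(scores, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = scores - scores.max(axis=axis, keepdims=True)
    weights = np.exp(shifted)
    return weights / weights.sum(axis=axis, keepdims=True)


def attention(Q, K, V):
    """Scaled dot-product attention, Equation (2)."""
    return softmax(Q @ K.T / np.sqrt(Q.shape[-1])) @ V


def multi_head_attention(Q, K, V, W_q, W_k, W_v, W_o):
    """Equation (3).

    W_q, W_k, W_v: per-head projections with shape (h, d, d_k).
    W_o: output projection with shape (h * d_k, d).
    """
    heads = [attention(Q @ wq, K @ wk, V @ wv) for wq, wk, wv in zip(W_q, W_k, W_v)]
    return np.concatenate(heads, axis=-1) @ W_o


# Illustrative sizes (assumed), with d_k = d / h as in the text.
rng = np.random.default_rng(0)
n, d, h = 8, 12, 3
d_k = d // h
X = rng.normal(size=(n, d))
W_q, W_k, W_v = (rng.normal(size=(h, d, d_k)) for _ in range(3))
W_o = rng.normal(size=(h * d_k, d))
out = multi_head_attention(X, X, X, W_q, W_k, W_v, W_o)  # shape (n, d)
```

Because each head works in a $d_{k}$-dimensional subspace and the concatenation has width $h \cdot d_{k} = d$, the output projection returns a matrix with the same shape as the input.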

This attention-based temporal modeling provides the foundation for the temporal pathway enhancement introduced in Section 3.1.

2.2. Graph Attention for Meteorological Interaction Representation

PV generation is influenced by multiple meteorological variables whose nonlinear relationships evolve across weather conditions. Graph Attention Networks (GATs) enable adaptive modeling of such feature interactions. Let $h_{i}$ denote the representation of the $i$-th meteorological variable, and let $N(i)$ denote its neighborhood in the constructed feature graph. A learnable vector $a$ and weight matrix $W$ are used to compute the attention coefficient $\alpha_{ij}$ between variables $i$ and $j$:

$$\alpha_{ij} = \frac{\exp\left(\sigma\left(a^{T} [W h_{i} \,\Vert\, W h_{j}]\right)\right)}{\sum_{k \in N(i)} \exp\left(\sigma\left(a^{T} [W h_{i} \,\Vert\, W h_{k}]\right)\right)} \tag{4}$$

where $\sigma(\cdot)$ is a nonlinear activation function and $\Vert$ denotes concatenation. The updated feature embedding for variable $i$ is obtained by aggregating contextual information from its neighbors as

$$h_{i}^{\prime} = \sum_{j \in N(i)} \alpha_{ij} W h_{j} \tag{5}$$

This mechanism provides the spatial dependency modeling basis for the fusion strategy introduced in Section 3.2.
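A single graph attention layer following Equations (4) and (5) can be sketched as below. The sketch makes several assumptions that are not specified in the text: $\sigma$ is taken to be a LeakyReLU with slope 0.2 (the choice made in the original GAT formulation), the feature graph contains self-loops so that every node has at least one neighbor, and the four nodes, the adjacency pattern, and the random weights are purely illustrative.

```python
import numpy as np


def gat_layer(H, adjacency, W, a, slope=0.2):
    """Equations (4)-(5) for M variables (graph nodes).

    H: (M, d_in) node features, one row per meteorological variable.
    adjacency: (M, M) boolean, True where j belongs to N(i); self-loops assumed.
    W: (d_out, d_in) weight matrix; a: (2 * d_out,) attention vector.
    """
    Z = H @ W.T  # row i holds W h_i
    d_out = Z.shape[1]
    # a^T [z_i || z_j] splits into a term for node i plus a term for node j.
    scores = (Z @ a[:d_out])[:, None] + (Z @ a[d_out:])[None, :]
    scores = np.where(scores > 0, scores, slope * scores)  # LeakyReLU (assumed sigma)
    scores = np.where(adjacency, scores, -np.inf)  # softmax only over N(i)
    scores = scores - scores.max(axis=1, keepdims=True)
    alpha = np.exp(scores)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    return alpha @ Z, alpha  # row i of the first output is h_i'


# Four hypothetical nodes, e.g. irradiance, temperature, humidity, wind speed.
rng = np.random.default_rng(0)
M, d_in, d_out = 4, 6, 3
H = rng.normal(size=(M, d_in))
adjacency = np.ones((M, M), dtype=bool)
adjacency[0, 3] = adjacency[3, 0] = False  # drop one edge to show the masking
W = rng.normal(size=(d_out, d_in))
a = rng.normal(size=2 * d_out)
H_new, alpha = gat_layer(H, adjacency, W, a)
```

Masked pairs receive an attention coefficient of exactly zero, so each updated embedding is a convex combination of the transformed features of its neighbors only.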

2.3. Quantile Regression and Calibration for Uncertainty Modeling

To explicitly quantify predictive uncertainty in PV power forecasting, this study adopts a quantile regression framework, which estimates conditional quantiles of the target variable without assuming an explicit parametric form of the conditional output distribution. Unlike deterministic point forecasting, uncertainty is represented through multiple quantile-dependent outputs that characterize the conditional distribution of PV power given the input features.

For a quantile level $\tau \in (0, 1)$, the prediction function $\hat{y}_{\tau}(X_{t})$ is obtained by minimizing the asymmetric pinball loss

$$L_{\tau}(\hat{y}, y) = \max\left(\tau (y - \hat{y}), (\tau - 1)(y - \hat{y})\right) \tag{6}$$

where $y$ denotes the observed PV power and $\hat{y}_{\tau}$ represents the predicted value at quantile level $\tau$. Different quantile levels capture different parts of the conditional distribution, thereby explicitly reflecting predictive uncertainty.
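The asymmetry of Equation (6) is easiest to see numerically. The short sketch below averages the pinball loss over a few samples; the power values and the fixed offset of 0.1 are made-up numbers used only to show that, at $\tau = 0.9$, under-prediction is penalized nine times more heavily than over-prediction of the same size.

```python
import numpy as np


def pinball_loss(y_pred, y_true, tau):
    """Mean pinball loss of Equation (6) at quantile level tau."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.maximum(tau * diff, (tau - 1.0) * diff)))


# Hypothetical normalized PV power observations.
y_true = np.array([0.2, 0.5, 0.9])
under = pinball_loss(y_true - 0.1, y_true, tau=0.9)  # prediction too low: about 0.09
over = pinball_loss(y_true + 0.1, y_true, tau=0.9)   # prediction too high: about 0.01
```

This imbalance is what pushes the network output for $\tau = 0.9$ above most observations, so that roughly 90% of the targets fall below it.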

Based on the estimated quantiles, prediction intervals can be constructed by combining lower and upper estimated quantiles $\hat{y}_{\tau_{l}}$ and $\hat{y}_{\tau_{u}}$, expressed as

$$\mathrm{PI}_{t} = [\hat{y}_{\tau_{l}}, \hat{y}_{\tau_{u}}] \tag{7}$$

Here, $\tau_{l}$ and $\tau_{u}$ represent the lower and upper quantile levels, respectively, and the resulting interval $\mathrm{PI}_{t}$ provides an interpretable measure of forecast uncertainty at a specified confidence level.

However, distribution shifts and finite-sample effects may cause predicted intervals to deviate from nominal confidence levels. To improve coverage reliability, conformal calibration introduces an adaptive adjustment term $\eta$ to both bounds:

$$C(X_{t}) = [\hat{y}_{\tau_{l}}(X_{t}) - \eta, \hat{y}_{\tau_{u}}(X_{t}) + \eta] \tag{8}$$

In this formulation, $\eta$ represents the calibration parameter learned from calibration data, which compensates for systematic under-coverage or over-coverage of the raw quantile-based intervals. The calibrated interval $C(X_{t})$ thus provides a more reliable uncertainty estimate while preserving the distribution-free property of quantile regression. This formulation supports the development of the adaptive calibration strategy expanded in Section 3.4.
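For reference, the sketch below shows how $\eta$ in Equation (8) is obtained in standard split-conformal CQR, using a held-out calibration set and the conformity score $E_{i} = \max(\hat{y}_{\tau_{l}}(X_{i}) - y_{i},\; y_{i} - \hat{y}_{\tau_{u}}(X_{i}))$. This is the textbook rule rather than the adaptive variant of Section 3.4, and the synthetic Gaussian targets and constant raw quantiles are assumptions made only to keep the example self-contained.

```python
import numpy as np


def cqr_adjustment(q_lo_cal, q_hi_cal, y_cal, alpha=0.1):
    """Return eta for Equation (8) from calibration data (standard split CQR).

    eta is the ceil((n + 1) * (1 - alpha))-th smallest conformity score.
    """
    scores = np.maximum(q_lo_cal - y_cal, y_cal - q_hi_cal)
    n = scores.size
    rank = int(np.ceil((n + 1) * (1 - alpha)))
    if rank > n:
        return float("inf")  # too few calibration points for this alpha
    return float(np.sort(scores)[rank - 1])


# Synthetic data (assumed): standard normal targets and raw intervals that are too narrow.
rng = np.random.default_rng(0)
y_cal, y_test = rng.normal(size=2000), rng.normal(size=2000)
q_lo_cal, q_hi_cal = np.full(2000, -1.0), np.full(2000, 1.0)

eta = cqr_adjustment(q_lo_cal, q_hi_cal, y_cal, alpha=0.1)
coverage = float(np.mean((y_test >= -1.0 - eta) & (y_test <= 1.0 + eta)))
```

When calibration and test samples are exchangeable, this choice of $\eta$ gives marginal coverage of at least $1 - \alpha$. A negative $\eta$ is possible and shrinks intervals that were too wide.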

3. Proposed Methodology

The MTSA-QRN is designed to learn multi-scale dynamic variations in PV power sequences while capturing complex meteorological dependencies that evolve under varying weather conditions. The overall framework is illustrated in Figure 2. Historical PV power data and meteorological variables are separately processed through temporal and spatial learning pathways. The temporal pathway extracts both transient and evolving patterns via multi-scale convolutional structures, while the spatial pathway employs graph attention to represent nonlinear coupling among meteorological drivers such as irradiance attenuation and wind-induced cooling. The two representations are further guided by a latent weather-state representation to enhance predictive robustness under sunny, cloudy, and rainy conditions. Finally, the unified spatiotemporal embeddings are fed into a quantile prediction module and calibrated to ensure statistically reliable uncertainty quantification for operational decision-making.

To further clarify how temporal and meteorological dependencies interact within the model, the core feature learning mechanism is additionally illustrated in Figure 3, where the TCN-based temporal pathway and the GAT-based spatial pathway are jointly modulated through a weather-adaptive fusion strategy. This mechanism dynamically adjusts the contribution of heterogeneous features under varying atmospheric disturbances, constructing weather-aware representations that enhance robustness against prediction performance degradation.

Through this coordinated architecture, MTSA-QRN effectively integrates multi-scale temporal dependency modeling, meteorological interaction enhancement, weather-adaptive refinement, and reliability-aware quantile estimation. These four innovations collectively enhance forecasting performance in high-uncertainty environments such as severe cloud instability and sudden irradiance fluctuations. Detailed methodological advancements are presented in Section 3.1, Section 3.2, Section 3.3 and Section 3.4.

3.1. Multi-Scale Temporal-Attention Dependency Enhancement

PV power output exhibits highly non-stationary temporal dynamics with different fluctuation mechanisms operating at distinct time scales, which are difficult to jointly capture using conventional attention modules. To address this limitation, a multi-scale temporal-attention dependency enhancement mechanism is introduced. Specifically, temporal representations are extracted across multiple receptive fields using dilated convolutions with different dilation factors to separate cloud-driven short-term variability from diurnal trend evolution. These multi-scale temporal features form complementary temporal dependencies that cannot be learned at a single resolution.

To further refine temporal focus, an attention-based temporal gating mechanism is applied to dynamically weight the multi-scale representations. Given multi-scale features $\{H_{1}, H_{2}, \dots, H_{L}\}$, temporal attention weights $\{\alpha_{1}, \alpha_{2}, \dots, \alpha_{L}\}$ are computed as

$$\alpha_{l} = \frac{\exp(f(H_{l}))}{\sum_{j=1}^{L} \exp(f(H_{j}))}, \quad l = 1, 2, \dots, L \tag{9}$$

where $f(\cdot)$ is a learnable scoring function measuring temporal relevance under current meteorological input. The enhanced temporal representation is derived as

$$H_{\mathrm{enh}} = \sum_{l=1}^{L} \alpha_{l} H_{l} \tag{10}$$

Thus, transient disturbances such as cloud transients receive higher contributions during unstable weather periods, whereas low-frequency variations dominate in clear and stable conditions. This formulation establishes an adaptive multi-scale temporal focus that selectively attends to different temporal scales as atmospheric dynamics shift, improving temporal generalization beyond what isolated TCN or standard self-attention architectures provide.

From the perspective of photovoltaic power forecasting, the above optimization process plays a clear functional role at different stages of temporal prediction. The multi-scale temporal representations extracted by dilated convolutions provide complementary views of short-term fluctuations and long-term trends in PV power output. Subsequently, the attention-based temporal gating mechanism adaptively reweights these representations, allowing the model to emphasize informative temporal components given prevailing meteorological conditions. As a result, the enhanced temporal representation serves as an optimized feature embedding that directly supports subsequent forecasting stages, improving both temporal generalization and predictive robustness.
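The two stages described above, multi-scale extraction by dilated convolutions followed by the gating of Equations (9) and (10), can be illustrated with a small NumPy sketch. Several elements are assumptions made for illustration: the averaging kernel, the dilation factors (1, 2, 4, 8), the synthetic daily curve, and in particular the linear form of the scoring function $f$, which the text only describes as learnable.

```python
import numpy as np


def dilated_causal_conv(x, kernel, dilation):
    """1-D causal convolution: y[t] = sum_k kernel[k] * x[t - k * dilation]."""
    x = np.asarray(x, dtype=float)
    y = np.zeros_like(x)
    for k, w in enumerate(kernel):
        shift = k * dilation
        if shift >= len(x):
            break  # this tap would only read values before the sequence start
        y[shift:] += w * x[: len(x) - shift]
    return y


def multi_scale_gate(features, score_weights):
    """Equations (9)-(10). features: (L, n) stack of H_l; score_weights: (n,)."""
    scores = features @ score_weights  # f(H_l), assumed linear in this sketch
    scores = scores - scores.max()
    alpha = np.exp(scores) / np.exp(scores).sum()
    return alpha @ features, alpha


# Synthetic day at 15 min resolution (96 steps): a bell-shaped curve plus noise.
rng = np.random.default_rng(0)
n = 96
x = np.clip(np.sin(np.linspace(0.0, np.pi, n)) + 0.1 * rng.normal(size=n), 0.0, None)
kernel = np.array([1.0, 1.0, 1.0]) / 3.0
features = np.stack([dilated_causal_conv(x, kernel, d) for d in (1, 2, 4, 8)])
score_weights = 0.1 * rng.normal(size=n)  # stands in for trained parameters
H_enh, alpha = multi_scale_gate(features, score_weights)
```

Small dilations keep the receptive field short and preserve rapid fluctuations, while larger dilations cover several hours with the same three taps. The weights `alpha` then decide how much each scale contributes to `H_enh`.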

To further illustrate the refinement process, Figure 4 visualizes the multi-head self-attention refinement module used to inject global contextual awareness into $H_{\mathrm{enh}}$, ensuring more discriminative temporal embedding before passing into the spatiotemporal fusion stage.

3.2. Meteorological Interaction Enhancement via Adaptive Graph Attention

Photovoltaic power generation is driven by multiple meteorological variables whose relationships vary depending on weather conditions. Standard GATs assume static and homogeneous feature correlations, which limits their ability to adapt to dynamic coupling effects induced by clouds, irradiance attenuation, thermal efficiency shifts, and wind-driven cooling.

To address these limitations, an adaptive graph attention mechanism is introduced to model heterogeneous, directional, and weather-sensitive meteorological interactions.

Given meteorological feature embeddings

$$F = \{f_{1}, f_{2}, \dots, f_{M}\}, \quad f_{i} \in \mathbb{R}^{d_{m}} \tag{11}$$

a learnable transformation is applied:

$$z_{i} = W_{g} f_{i}, \quad W_{g} \in \mathbb{R}^{d_{h} \times d_{m}} \tag{12}$$

Attention coefficients between variables $i$ and $j$ are computed as

$$\alpha_{ij} = \frac{\exp\left(\sigma\left(a_{g}^{T} [z_{i} \,\Vert\, z_{j}]\right)\right)}{\sum_{k \in N(i)} \exp\left(\sigma\left(a_{g}^{T} [z_{i} \,\Vert\, z_{k}]\right)\right)} \tag{13}$$

where $a_{g}$ is a learnable vector, $N(i)$ denotes the neighbors of variable $i$, and $\sigma(\cdot)$ is a nonlinear activation function. The updated feature embedding for variable $i$ is obtained through context-aware aggregation:

$$f_{i}^{\prime} = \sum_{j \in N(i)} \alpha_{ij} z_{j} \tag{14}$$

To further enhance adaptability under diverse atmospheric conditions, a latent weather-state modulation is introduced to dynamically refine attention coefficient magnitudes:

$$\alpha_{ij}^{*} = \gamma(w) \cdot \alpha_{ij}, \quad \gamma(w) = \mathrm{softmax}(W_{w} w) \tag{15}$$

where $w$ denotes a global weather representation extracted from the unified spatiotemporal pathway, and $W_{w}$ is a learnable transformation matrix. This modulation enables the model to strengthen humidity–cloud coupling under cloudy regimes, emphasize temperature–power efficiency interactions on clear days, and reinforce wind–thermal dissipation relationships during high-wind-speed periods, resulting in more accurate characterization of meteorologically driven uncertainty.

Such flexible spatial relational learning contributes to improved characterization of meteorologically driven variability, especially when environmental conditions deviate from typical patterns.

Finally, the derived condition-aware interaction representation is forwarded to the weather-adaptive feature fusion module detailed in Section 3.3, serving as a refined spatial input for unified spatiotemporal learning.

3.3. Weather-Adaptive Feature Fusion

PV generation behavior exhibits pronounced regime dependence driven by atmospheric variations such as cloud movement, seasonal irradiance shifts, and humidity-induced attenuation. Sunny periods generally produce smooth bell-shaped curves governed by deterministic solar elevation, whereas cloudy and rainy periods are characterized by abrupt power drops and high-frequency volatility. These distinct dynamics are shown in Figure 5, demonstrating that a fixed spatiotemporal fusion strategy cannot maintain forecasting robustness under rapidly evolving weather patterns.

To address these challenges, a weather-adaptive feature fusion mechanism is introduced. Given the enhanced temporal features $H_{\mathrm{enh}}$ from Section 3.1 and meteorological interaction features $F^{\prime} = \{f_{1}^{\prime}, f_{2}^{\prime}, \dots, f_{M}^{\prime}\}$ from Section 3.2, latent weather states are first inferred through a learnable nonlinear transformation:

$$s = \phi\left(W_{s} \left[H_{\mathrm{enh}} \,\Vert\, \sum_{i=1}^{M} f_{i}^{\prime}\right] + b_{s}\right) \tag{16}$$

where $\Vert$ denotes concatenation, $\phi(\cdot)$ is a nonlinear activation, and $s \in \mathbb{R}^{K}$ represents the atmospheric state embedding (e.g., sunny-like vs. cloudy-like vs. rainy-like patterns).

The fusion weights are designed to adapt to evolving atmospheric regimes through a feedback-driven update mechanism. Specifically, the weights respond to latent weather-state representations inferred from spatiotemporal features, enabling the dynamic adjustment of contributions from the temporal and meteorological pathways under different operating conditions.

A soft weighting function is then designed to map weather states to dynamic fusion coefficients:

$$\beta = \mathrm{softmax}(W_{\beta} s + b_{\beta}) = [\beta_{T}, \beta_{M}], \quad \beta_{T} + \beta_{M} = 1 \tag{17}$$

where $\beta_{T}$ and $\beta_{M}$ indicate the contribution of temporal and meteorological pathways, respectively.

The final fused representation is derived as

$$Z = \beta_{T} H_{\mathrm{enh}} + \beta_{M} \left(\sum_{i=1}^{M} f_{i}^{\prime}\right) \tag{18}$$

Thus, under stable irradiance conditions, the model emphasizes trend-dominant temporal structures ($\beta_{T} \uparrow$), while under volatile cloud coverage, the fusion prioritizes meteorology-driven disturbance cues ($\beta_{M} \uparrow$).

To further enhance adaptability, a residual modulation branch is introduced to correct distributional deviations caused by severe weather interventions:

$$Z_{\mathrm{final}} = Z + \psi(s) \tag{19}$$

where $\psi(\cdot)$ compensates for weather-induced representation bias.

Through this weather-aware fusion strategy, MTSA-QRN achieves:

stronger responsiveness to short-term variability in unstable conditions,
improved trend integrity during clear-sky periods,
enhanced robustness against atmospheric uncertainty propagation.

The refined representation $Z_{\mathrm{final}}$ is then supplied to the probabilistic quantile regression and adaptive calibration module in Section 3.4.
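The fusion steps of Equations (16) to (19) can be traced with a compact NumPy sketch. It rests on assumptions that go beyond the text: $H_{\mathrm{enh}}$ and the pooled meteorological features are treated as vectors of the same length so that Equation (18) is well defined, $\phi$ is taken to be `tanh`, $\psi$ is taken to be a linear map, and all weights and sizes are random placeholders rather than trained parameters.

```python
import numpy as np


def weather_adaptive_fusion(H_enh, F_prime, params):
    """Equations (16)-(19) for one sample.

    H_enh: (D,) temporal features; F_prime: (M, D) meteorological features f_i'.
    params: dict of weights (hypothetical names, random in this sketch).
    """
    pooled = F_prime.sum(axis=0)  # sum_i f_i'
    s = np.tanh(params["W_s"] @ np.concatenate([H_enh, pooled]) + params["b_s"])  # Eq. (16)
    logits = params["W_beta"] @ s + params["b_beta"]
    beta = np.exp(logits - logits.max())
    beta = beta / beta.sum()  # Eq. (17): [beta_T, beta_M]
    Z = beta[0] * H_enh + beta[1] * pooled  # Eq. (18)
    Z_final = Z + params["W_psi"] @ s  # Eq. (19), with psi assumed linear
    return Z_final, beta, s


# Illustrative sizes (assumed): D features, K weather states, M meteorological variables.
rng = np.random.default_rng(0)
D, K, M = 6, 3, 4
H_enh = rng.normal(size=D)
F_prime = rng.normal(size=(M, D))
params = {
    "W_s": rng.normal(size=(K, 2 * D)), "b_s": np.zeros(K),
    "W_beta": rng.normal(size=(2, K)), "b_beta": np.zeros(2),
    "W_psi": 0.1 * rng.normal(size=(D, K)),
}
Z_final, beta, s = weather_adaptive_fusion(H_enh, F_prime, params)
```

Since `beta` comes from a two-way softmax, the two pathway weights are non-negative and sum to one, so the fused vector before the residual term is a convex combination of the temporal and meteorological representations.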

3.4. Probabilistic Prediction and Adaptive Calibration

Benefiting from the stability-constrained adaptive fusion mechanism described in Section 3.3, the refined spatiotemporal representations provide a robust foundation for subsequent probabilistic prediction and calibration.

To quantify predictive uncertainty in PV forecasting, neural quantile regression is employed to estimate conditional quantiles without assuming any explicit distributional form. However, coverage deviations often arise due to distributional shifts induced by weather transitions or limited training samples, causing the empirical coverage $\hat{\gamma}$ to diverge from the desired confidence level $\gamma$. To mitigate such deviations, a coverage-driven adaptive boundary calibration strategy is devised to refine prediction intervals while preserving sharpness.

Given initial prediction intervals $\mathrm{PI}_{t}$, the empirical coverage is first evaluated as

$$\hat{\gamma} = \frac{1}{N} \sum_{t=1}^{N} \mathbb{1}(y_{t} \in \mathrm{PI}_{t}) \tag{20}$$

The coverage bias can be represented by

$$\delta = \hat{\gamma} - \gamma \tag{21}$$

A dynamic adjustment term a is iteratively updated using a golden-ratio-guided rule:

a^{(k+1)} = a^{(k)} - ν \cdot δ \cdot a^{(k)}, \quad ν = \frac{\sqrt{5} - 1}{2}

(22)

The calibrated interval at iteration k is then

C^{(k)}(X_{t}) = [\hat{y}_{τ_{l}}(X_{t}) - a^{(k)}, \hat{y}_{τ_{u}}(X_{t}) + a^{(k)}]

(23)

This update formulation introduces a contraction property toward the equilibrium boundary a^{(k)}, which supports stable convergence of the calibration process even when weather-driven uncertainty fluctuates. As highlighted in Figure 6, the empirical coverage gradually approaches the target level (90% shown here) without aggressive interval expansion, indicating a balance between reliability and informativeness.
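The iteration in Equations (20)–(23) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names, the starting value `a0`, the tolerance `tol`, and the iteration cap `max_iter` are assumptions introduced here, and the coverage is assumed to be re-evaluated on a held-out calibration set after every update. Because the rule is multiplicative, the sketch requires `a0 > 0`; starting from zero would leave `a` unchanged.

```python
import math

NU = (math.sqrt(5) - 1) / 2  # golden-ratio step size nu from Eq. (22)


def coverage(y, lower, upper, a):
    """Empirical coverage (Eq. (20)) of intervals widened by `a` on each side."""
    hits = sum(1 for yt, lo, hi in zip(y, lower, upper) if lo - a <= yt <= hi + a)
    return hits / len(y)


def calibrate(y, lower, upper, gamma=0.90, a0=0.05, max_iter=200, tol=0.005):
    """Iterate Eq. (22) until |coverage - gamma| <= tol or max_iter is reached.

    a0, max_iter and tol are hypothetical settings chosen for this sketch.
    """
    a = a0
    for _ in range(max_iter):
        delta = coverage(y, lower, upper, a) - gamma  # coverage bias, Eq. (21)
        if abs(delta) <= tol:
            break
        a = a - NU * delta * a                        # boundary update, Eq. (22)
    return a, coverage(y, lower, upper, a)


# Toy calibration set (illustrative values only, not data from the study).
y = [-0.3 + 0.6 * t / 199 for t in range(200)]
lower = [-0.1] * 200
upper = [0.1] * 200
a, cov = calibrate(y, lower, upper, gamma=0.90)
print(f"a = {a:.4f}, coverage = {cov:.3f}")
```

In this toy setting the initial intervals under-cover, so δ is negative and `a` grows until the widened intervals reach the target; when the intervals over-cover, the same rule shrinks `a`.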

To establish a realistic testing scenario, the experiments are conducted using real operational data from a 40 MW grid-connected PV plant in Northern China. Moreover, the adaptive calibration benefits from the enhanced interaction modeling described in Section 3.2 and the weather-aware feature refinement introduced in Section 3.3, which together provide more robust uncertainty structures for interval adjustment. To illustrate this collaborative effect, Table 2 presents ablation results comparing different spatial dependency modeling strategies. The results show that incorporating the adaptive GAT leads to lower CRPS and narrower PINAW while maintaining or improving PICP, indicating that more expressive representation learning contributes to higher-quality interval forecasts with reliable coverage.

Through this coverage-guided refinement mechanism, the MTSA-QRN establishes a closed uncertainty-learning loop that compensates for extreme weather fluctuations and potential distribution mismatch. As a result, the proposed framework shows the potential to offer more trustworthy and well-calibrated probabilistic forecasts for risk-aware operational decision-making in real PV systems.

4. Experimental Results and Discussion

This section presents a comparative evaluation of the proposed MTSA-QRN model against representative probabilistic forecasting baselines. The analysis is based on real monitoring data and supervisory control and data acquisition (SCADA) measurements collected from a 40 MW grid-connected photovoltaic plant in northern China, recorded at a 15 min temporal resolution, which is consistent with typical operational requirements for ultra-short-term PV forecasting. All models are evaluated under identical forecasting settings, and performance is assessed in terms of probabilistic accuracy, interval reliability, and predictive sharpness.

4.1. Evaluation Metrics

Probabilistic forecasting performance is evaluated using three widely recognized metrics that jointly assess distributional accuracy, reliability, and sharpness. The Continuous Ranked Probability Score (CRPS) is used to measure the overall discrepancy between a forecast cumulative distribution F and the actual observation y, expressed as

\mathrm{CRPS}(F, y) = \int_{-\infty}^{+\infty} (F(z) - I(z \geq y))^{2} \, dz

(24)

A lower CRPS value indicates a forecast distribution that more closely matches the observed outcome.
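The text does not state how the integral in Equation (24) is evaluated numerically. One common option, shown here purely as an assumption, treats a finite set of forecast samples as an empirical CDF. For that step function the integral has the closed form E|X − y| − ½·E|X − X′|, where X and X′ are independent draws from the samples. Using predicted quantiles in place of samples would be a further approximation. The function name `crps_ensemble` is hypothetical.

```python
def crps_ensemble(samples, y):
    """CRPS of the empirical CDF built from `samples`, scored against `y`.

    Uses the identity CRPS = E|X - y| - 0.5 * E|X - X'|, which equals the
    integral in Eq. (24) when F is the empirical CDF of the samples.
    """
    m = len(samples)
    term_obs = sum(abs(x - y) for x in samples) / m
    term_spread = sum(abs(xi - xj) for xi in samples for xj in samples) / (2 * m * m)
    return term_obs - term_spread


# Two equally likely forecast values 0 and 1, observation 0.5:
# F(z) = 0.5 on [0, 1), so the integral is 0.5 * 0.25 + 0.5 * 0.25 = 0.25.
print(crps_ensemble([0.0, 1.0], 0.5))
```

A single-sample forecast reduces to the absolute error, which is why CRPS is often described as a probabilistic generalization of MAE.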

To assess the reliability of uncertainty quantification, the Prediction Interval Coverage Probability (PICP) is calculated as the proportion of observations falling within the prediction interval [\hat{y}_{i}^{L}, \hat{y}_{i}^{U}]:

\mathrm{PICP} = \frac{1}{N} \sum_{i=1}^{N} I(y_{i} \in [\hat{y}_{i}^{L}, \hat{y}_{i}^{U}])

(25)

A well-calibrated forecast yields PICP values close to the nominal confidence level (e.g., 90%).

Meanwhile, the Prediction Interval Normalized Average Width (PINAW) quantifies the informativeness of prediction intervals:

\mathrm{PINAW} = \frac{1}{N \cdot R} \sum_{i=1}^{N} (\hat{y}_{i}^{U} - \hat{y}_{i}^{L})

(26)

where R = \max(y) - \min(y) normalizes interval width with respect to the data range.

Smaller PINAW indicates sharper prediction intervals while retaining comparability across datasets and power scales.

Taken together, CRPS rewards accurate probabilistic distributions, PICP measures uncertainty reliability, and PINAW evaluates the usefulness of the produced intervals in operational decision-making.
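Equations (25) and (26) translate directly into a few lines of code. The sketch below is illustrative: the function names and the toy values are assumptions, not material from the study.

```python
def picp(y, lower, upper):
    """Prediction Interval Coverage Probability, Eq. (25)."""
    hits = sum(1 for yi, lo, hi in zip(y, lower, upper) if lo <= yi <= hi)
    return hits / len(y)


def pinaw(y, lower, upper):
    """Prediction Interval Normalized Average Width, Eq. (26)."""
    data_range = max(y) - min(y)  # R in Eq. (26)
    mean_width = sum(hi - lo for lo, hi in zip(lower, upper)) / len(y)
    return mean_width / data_range


# Toy example (illustrative values only).
y = [1.0, 2.0, 3.0, 4.0]
lower = [0.5, 2.5, 2.5, 3.0]
upper = [1.5, 3.5, 3.5, 5.0]
print(picp(y, lower, upper))             # 0.75: three of four observations covered
print(round(pinaw(y, lower, upper), 4))  # 0.4167: mean width 1.25 over range 3
```

The two metrics pull in opposite directions: widening every interval raises PICP and PINAW together, so they are only informative when reported side by side.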

4.2. Multi-Scale Decomposition of Historical PV Power

Historical PV power contains temporal components spanning multiple frequency scales, influenced by rapid cloud-motion fluctuations, diurnal irradiance cycles, and slowly varying atmospheric processes. To disentangle these behaviors, the historical power series is decomposed using VMD, which produces a finite set of intrinsic mode functions (IMFs) representing progressively decreasing frequency bands.

The number of decomposition modes is determined using an elbow-based criterion, which suggests an appropriate decomposition with seven intrinsic mode functions (IMFs). Figure 7 illustrates the original PV power series together with the resulting IMFs.

To identify the components that contribute most effectively to forecasting performance and to suppress redundant or noisy information, the predictive relevance of each IMF is further evaluated using an ensemble of six complementary dependency measures, including LASSO regression, Maximal Information Coefficient (MIC), Random Forest (RF) feature importance, Recursive Feature Elimination (RFE), Ridge regression, and Pearson correlation. Each IMF is assigned a normalized score under each criterion, and the average of these scores is used as a comprehensive importance index.

As shown in Figure 8, IMF₆ and IMF₇ consistently exhibit relatively low relevance scores (below the composite threshold of 0.2), suggesting that they mainly capture high-frequency components with limited contribution to predictive performance. In contrast, IMF₁–IMF₅ show substantially higher relevance and are therefore retained as input features for the forecasting model.
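The scoring-and-thresholding step can be sketched as below. The text says each criterion yields a normalized score but does not give the normalization, so scaling each criterion by its largest absolute value is an assumption made here. The function names and the toy scores are also hypothetical; only the averaging and the 0.2 threshold come from the description above.

```python
def composite_importance(raw_scores):
    """Average the per-criterion scores after scaling each criterion to [0, 1].

    raw_scores maps a criterion name to one raw score per component.
    Scaling by the largest absolute value is an assumption of this sketch.
    """
    n_components = len(next(iter(raw_scores.values())))
    totals = [0.0] * n_components
    for scores in raw_scores.values():
        peak = max(abs(s) for s in scores)
        for i, s in enumerate(scores):
            totals[i] += abs(s) / peak if peak > 0 else 0.0
    return [t / len(raw_scores) for t in totals]


def select_components(index, threshold=0.2):
    """Return zero-based positions of components at or above the threshold."""
    return [i for i, v in enumerate(index) if v >= threshold]


# Toy scores for three components under two criteria (not data from the study).
raw = {"pearson": [0.8, 0.4, 0.0], "rf_importance": [10.0, 5.0, 1.0]}
index = composite_importance(raw)
print(index, select_components(index))
```

Averaging across heterogeneous criteria reduces the chance that one measure's bias (for example, Pearson correlation missing nonlinear dependence) decides which components are kept.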

The reconstruction results in Figure 9 indicate that the selected five IMFs preserve the dominant temporal structure of the original PV power series, achieving a reconstruction coefficient of determination of 0.982, which confirms that the essential dynamics are well retained after feature selection.
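The reconstruction check can be reproduced in outline by summing the retained IMFs and computing the coefficient of determination against the original series. The sketch below uses hypothetical function names and toy values, and assumes the standard definition R² = 1 − SS_res / SS_tot.

```python
def reconstruct(imfs, keep):
    """Sum the selected IMFs sample by sample."""
    return [sum(imfs[k][t] for k in keep) for t in range(len(imfs[0]))]


def r_squared(y, y_hat):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


# Toy series: two dominant components plus a small oscillating residual
# (illustrative values only, not data from the study).
imfs = [
    [0.5, 1.0, 1.5, 2.0],
    [0.5, 1.0, 1.5, 2.0],
    [0.1, -0.1, 0.1, -0.1],
]
series = reconstruct(imfs, keep=[0, 1, 2])
partial = reconstruct(imfs, keep=[0, 1])
print(r_squared(series, partial))
```

An R² close to 1 indicates that the discarded components carry little of the signal variance.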

4.3. Overall Probabilistic Forecasting Performance and Layer-Wise Ablation Analysis

This subsection first evaluates the overall probabilistic forecasting performance of the proposed MTSA-QRN against representative baseline models, and then quantitatively examines the contribution of key components through a layer-wise ablation analysis covering structural modeling, adaptive coordination, and reliability calibration.

Under the 90% prediction interval, MTSA-QRN achieves the lowest CRPS (0.0400) prior to calibration, substantially outperforming Quantile-GRU (0.0908), Quantile-LSTM (0.0995), and Quantile-Transformer (0.0954). However, its PICP is 0.8140, which is below the nominal 90% confidence level. After applying the adaptive CQR calibration, the PICP of MTSA-QRN increases to 0.9053, while the CRPS rises only slightly to 0.0453 and the PINAW becomes 0.3870, remaining considerably smaller than that of Quantile-SVR (0.5331). These results indicate a favorable trade-off between reliability and interval sharpness (Table 3).

Table 4 presents the layer-wise ablation results of MTSA-QRN. In the structural layer, removing the Transformer or TCN leads to a substantial increase in CRPS (to 0.1097 and 0.1127, respectively), underscoring the importance of multi-scale temporal modeling. Removing the GAT module increases CRPS to 0.0554 and noticeably enlarges the prediction interval width (PINAW = 0.5059), suggesting that modeling meteorological dependency contributes to improved distributional accuracy. In the adaptive layer, disabling the weather-adaptive module or the fusion gate results in pronounced performance degradation, with CRPS increasing to 0.1338 and 0.1434, respectively, highlighting the importance of adaptive feature coordination under varying weather conditions. In the calibration layer, removing CQR increases CRPS to 0.1284 and leads to less reliable coverage, indicating that explicit reliability calibration plays a critical role in improving the quality of probabilistic forecasts.
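The ΔCRPS column of Table 4 is the relative change of each ablated variant against the full model (CRPS = 0.0453). The short sketch below recomputes it from the rounded CRPS values quoted above. The helper name is hypothetical, and last-digit differences from the table may occur if the published percentages were derived from unrounded scores, which is an assumption.

```python
def relative_change_pct(value, reference):
    """Percentage change of `value` relative to `reference`."""
    return (value - reference) / reference * 100.0


full_model_crps = 0.0453  # MTSA-QRN, Table 4
ablations = {
    "w/o Transformer": 0.1097,
    "w/o TCN": 0.1127,
    "w/o GAT": 0.0554,
    "w/o Weather-Adaptive module": 0.1338,
    "w/o Fusion Gate": 0.1434,
    "w/o CQR": 0.1284,
}
for name, crps in ablations.items():
    print(f"{name}: {relative_change_pct(crps, full_model_crps):+.1f}%")
```

For example, removing the GAT module gives (0.0554 − 0.0453) / 0.0453 ≈ +22.3%, the smallest degradation among the ablated components.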

Overall, the results presented in Table 3 and Table 4 indicate that the superior probabilistic performance of MTSA-QRN arises from the coordinated effects of structural spatiotemporal modeling, adaptive scenario-aware integration, and reliability-driven calibration.

4.4. Calibration Performance Analysis

Reliable probabilistic forecasts require prediction intervals that accurately reflect the uncertainty of PV power generation under varying weather conditions. Although MTSA-QRN effectively models the predictive distribution, its raw quantile outputs show miscalibration relative to nominal confidence levels, a behavior commonly observed in deep models trained with the pinball loss.
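For reference, the pinball (quantile) loss mentioned above has the standard form ρ_τ(u) = τ·u for u ≥ 0 and (τ − 1)·u otherwise, with u = y − ŷ_τ. The sketch below implements that textbook definition. It is not taken from the authors' code, and the function name is an assumption.

```python
def pinball_loss(y_true, y_pred, tau):
    """Mean pinball (quantile) loss for quantile level tau in (0, 1)."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction is weighted by tau, over-prediction by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)


# For tau = 0.9, under-predicting by 0.5 costs nine times more than
# over-predicting by 0.5.
print(pinball_loss([1.0], [0.5], 0.9), pinball_loss([0.0], [0.5], 0.9))
```

Minimizing this loss targets each quantile separately and gives no finite-sample coverage guarantee for the resulting interval, which is the gap a post hoc calibration step is meant to close.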

To enforce the statistical validity of prediction intervals, the proposed framework incorporates an adaptive conformal calibration mechanism. Figure 10 compares the target and achieved coverage rates before and after calibration for confidence levels ranging from 80% to 95%. Prior to calibration, all PICP values fall below nominal levels, indicating a tendency toward under-dispersion. After calibration, the achieved coverage closely aligns with target levels across all confidence intervals, confirming the effectiveness of the proposed strategy in improving reliability.
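For comparison, the standard split-conformal form of conformalized quantile regression (Romano et al., listed in the references) computes a single offset from a held-out calibration set. The sketch below shows that textbook procedure, not the adaptive golden-ratio variant of Section 3.4. The function names and toy values are assumptions.

```python
import math


def cqr_offset(y_cal, lower_cal, upper_cal, alpha):
    """Conformal offset Q for a nominal coverage of 1 - alpha.

    Scores are E_i = max(lower_i - y_i, y_i - upper_i); Q is the
    ceil((n + 1) * (1 - alpha))-th smallest score.
    """
    scores = sorted(
        max(lo - y, y - hi) for y, lo, hi in zip(y_cal, lower_cal, upper_cal)
    )
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    # With too few calibration points the guarantee needs an infinite offset.
    return scores[k - 1] if k <= n else math.inf


def apply_offset(lower, upper, q):
    """Widen (q > 0) or tighten (q < 0) every interval by q on both sides."""
    return [lo - q for lo in lower], [hi + q for hi in upper]


# Toy calibration set (illustrative values only).
q = cqr_offset([0.5, 1.2, -0.3, 1.5], [0.0] * 4, [1.0] * 4, alpha=0.5)
new_lower, new_upper = apply_offset([0.0], [1.0], q)
print(q, new_lower, new_upper)
```

The offset can be negative when the raw intervals are too wide, so the same step can sharpen as well as widen intervals.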

Importantly, interval expansion during the calibration process remains moderate, as evidenced by Table 3, demonstrating a favorable trade-off between interval width and confidence satisfaction. Unlike baseline models that require substantial interval enlargement to achieve comparable coverage, MTSA-QRN benefits from a more coherent representation of distributional uncertainty, thereby preserving competitive predictive sharpness.

Overall, this analysis highlights the benefit of integrating model-based uncertainty learning with adaptive statistical calibration, ensuring the predictive credibility required for operational decision-making in PV power systems.

5. Conclusions and Future Work

This study proposes MTSA-QRN, a Multi-scale Temporal–Spatial Attention Quantile Regression Network, for uncertainty-aware short-term photovoltaic power forecasting. The developed model addresses three major challenges in PV forecasting, namely the coexistence of multi-scale temporal patterns, nonlinear dependencies among meteorological variables, and weather-driven fluctuations. By integrating multi-scale decomposition, dilated temporal convolution with self-attention refinement, spatial dependency modeling via a graph attention mechanism, weather-adaptive feature transformation, and multi-quantile prediction equipped with adaptive calibration, the framework enables a comprehensive characterization of PV generation dynamics.

The experimental results confirm that the proposed MTSA-QRN generates more reliable and sharper probabilistic forecasts than competitive baseline models. In particular, learning spatial correlations among meteorological variables improves the ability to represent uncertainty under variable weather conditions, while the adaptive calibration strategy effectively aligns empirical coverage rates with target confidence levels. These findings demonstrate that the proposed probabilistic forecasting framework provides dependable uncertainty information for data-driven grid dispatching, reserve scheduling, and operational risk management.

Although the present work demonstrates promising performance and practical relevance, several avenues for further research remain. Future studies may incorporate physical constraints associated with PV conversion mechanisms to enhance interpretability and robust extrapolation, explore spatiotemporal extensions to jointly model multiple geographically dispersed PV plants, and investigate online learning strategies to address long-term distribution shifts in meteorological inputs. In addition, integrating economic evaluation metrics may provide deeper insights into the benefits of uncertainty-aware forecasting in electricity markets.

In summary, MTSA-QRN establishes a unified and data-efficient framework for short-term PV power forecasting with credible uncertainty quantification, supporting reliable renewable energy integration and operational decision-making in modern power systems.

Author Contributions

Conceptualization, G.W., Y.Z., Y.Y. and Z.Z.; methodology, G.W., Y.Z., Y.Y. and Z.Y.; software, G.W., Y.Z., L.D. and Z.Z.; validation, Z.Y., L.D. and J.H.; formal analysis, G.W., Y.Z., and L.D.; investigation, G.W., Y.Z. and Y.Y.; resources, Y.Z.; data curation, Y.Y. and Z.Z.; writing—original draft preparation, G.W. and Y.Y.; writing—review and editing, Y.Z., Y.Y. and Z.Z.; visualization, G.W. and Z.Z.; supervision, Y.Z.; project administration, Y.Z. and Y.Y.; funding acquisition, Y.Z. and Y.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported by the Lianyungang City Key Research and Development Program (Industrial Forward-Looking and Critical Core Technologies): Research on Ultra-Short-Term Probabilistic Forecasting of Photovoltaic Power Incorporating Micro-Meteorological Spatiotemporal Correlation (Project No. CG2315), and the Ningxia Natural Science Foundation Project under Grant 2023AAC03836.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data that support the findings of this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Zhou, Y.; Sun, Y.; Wang, S.; Bai, L.; Hou, D.; Mahfoud, R.J.; Wang, P. Very Short-Term Probabilistic Prediction Method for Wind Speed Based on ALASSO-Nonlinear Quantile Regression and Integrated Criterion. CSEE J. Power Energy Syst. 2021, 9, 2121–2129.
2. Zhou, Y.; Sun, Y.; Wang, S.; Mahfoud, R.J.; Alhelou, H.H.; Hatziargyriou, N.; Siano, P. Performance Improvement of Very Short-Term Prediction Intervals for Regional Wind Power Based on Composite Conditional Nonlinear Quantile Regression. J. Mod. Power Syst. Clean Energy 2022, 10, 60–70.
3. Zhou, Y.; Wei, F.; Kuang, K.; Mahfoud, R.J. Research on A Deep Ensemble Learning Model for the Ultra-Short-Term Probabilistic Prediction of Wind Power. Electronics 2024, 13, 475.
4. Kuang, K.; Zhang, J.; Chen, Q.; Zhou, Y.; Yan, Y.; Dai, L.; Wang, G. Short-Term Prediction Intervals for Photovoltaic Power via Multi-Level Analysis and Dual Dynamic Integration. Electronics 2025, 14, 3068.
5. Sun, Y.; Zhou, Y.; Wang, S.; Mahfoud, R.J.; Alhelou, H.H.; Sideratos, G.; Hatziargyriou, N.; Siano, P. Nonparametric Probabilistic Prediction of Regional PV Outputs Based on Granule-based Clustering and Direct Optimization Programming. J. Mod. Power Syst. Clean Energy 2023, 11, 1450–1461.
6. Yang, M.; Guo, Y.; Fan, F. Ultra-Short-Term Prediction of Wind Farm Cluster Power Based on Embedded Graph Structure Learning with Spatiotemporal Information Gain. IEEE Trans. Sustain. Energy 2025, 16, 308–322.
7. Sobri, S.; Koohi-Kamali, S.; Rahim, N.A. Solar Photovoltaic Generation Forecasting Methods: A review. Energy Convers. Manag. 2018, 156, 459–497.
8. Perez, R.; Lorenz, E.; Pelland, S.; Beauharnois, M.; Knowe, G.V.; Hemker, K.; Heinemann, D.; Remund, J.; Müller, S.C.; Traunmüller, W.; et al. Comparison of Numerical Weather Prediction Solar Irradiance Forecasts in the US, Canada and Europe. Sol. Energy 2013, 94, 305–326.
9. Mathiesen, P.; Kleissl, J. Evaluation of Numerical Weather Prediction for Intra-Day Solar Forecasting in the Continental United States. Sol. Energy 2011, 85, 967–977.
10. Mohandes, M.A.; Halawani, T.O.; Rehman, S.; Hussain, A.A. Support Vector Machines for Wind Speed Prediction. Renew. Energy 2004, 29, 939–947.
11. Fernandez-Jimenez, L.A.; Muñoz-Jimenez, A.; Falces, A.; Mendoza-Villena, M.; Garcia-Garrido, E.; Lara-Santillan, P.M.; Zorzano-Alba, E.; Zorzano-Santamaria, P.J. Short-Term Power Forecasting System for Photovoltaic Plants. Renew. Energy 2012, 44, 311–317.
12. Chen, C.; Duan, S.; Cai, T.; Liu, B. Online 24-h Solar Power Forecasting Based on Weather Type Classification Using Artificial Neural Network. Sol. Energy 2011, 85, 2856–2870.
13. Wang, H.; Lei, Z.; Zhang, X.; Zhou, B.; Peng, J. A Review of Deep Learning for Renewable Energy Forecasting. Energy Convers. Manag. 2019, 198, 111799.
14. Sarmas, E.; Dimitropoulos, N.; Marinakis, V.; Mylona, Z.; Doukas, H. Transfer Learning Strategies for Solar Power Forecasting under Data Scarcity. Sci. Rep. 2022, 12, 14643.
15. Gao, M.; Li, J.; Hong, F.; Long, D. Short-Term Forecasting of Power Production in a Large-Scale Photovoltaic Plant Based on LSTM. Appl. Sci. 2019, 9, 3192.
16. Abdel-Nasser, M.; Mahmoud, K. Accurate Photovoltaic Power Forecasting Models Using Deep LSTM-RNN. Neural Comput. Appl. 2019, 31, 2727–2740.
17. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is All You Need. Adv. Neural Inf. Process. Syst. 2017, 30.
18. Wang, Y.; Shen, Y.; Mao, S.; Cao, G.; Nelms, R.M. Adaptive Learning Hybrid Model for Solar Intensity Forecasting. IEEE Trans. Ind. Inform. 2018, 14, 1635–1645.
19. Zhang, Y.; Beaudin, M.; Taheri, R.; Zareipour, H.; Wood, D. Day-Ahead Power Output Forecasting for Small-Scale Solar Photovoltaic Electricity Generators. IEEE Trans. Smart Grid 2015, 6, 2253–2262.
20. Van der Meer, D.W.; Widén, J.; Munkhammar, J. Review on Probabilistic Forecasting of Photovoltaic Power Production and Electricity Consumption. Renew. Sustain. Energy Rev. 2018, 81, 1484–1512.
21. Zhou, Y.; Sun, Y.; Wang, S.; Mahfoud, R.J.; Hou, D.; Wang, J. Very Short-Term Probabilistic Prediction for Regional Wind Power Generation Based on Opnpis. CSEE J. Power Energy Syst. 2024, 1–10.
22. Wang, S.; Zhang, W.; Sun, Y.; Trivedi, A.; Chung, C.Y.; Srinivasan, D. Wind Power Forecasting in the presence of data scarcity: A very short-term conditional probabilistic modeling framework. Energy 2024, 291, 130305.
23. Koenker, R.; Bassett, G. Regression Quantiles. Econometrica 1978, 46, 33–50.
24. Koenker, R.; Hallock, K.F. Quantile Regression. J. Econ. Perspect. 2001, 15, 143–156.
25. Taylor, J.W. A Quantile Regression Neural Network Approach to Estimating the Conditional Density of Multiperiod Returns. J. Forecast. 2000, 19, 299–311.
26. He, Q.; Zhao, M.; Li, S.; Li, X.; Wang, Z. Machine Learning Prediction of Photovoltaic Hydrogen Production Capacity Using Long Short-Term Memory Model. Energies 2025, 18, 543.
27. Zhou, H.; Zhang, S.; Peng, J.; Zhang, S.; Li, J.; Xiong, H.; Zhang, W. Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting. Proc. AAAI Conf. Artif. Intell. 2021, 35, 11106–11115.
28. Khosravi, A.; Nahavandi, S.; Creighton, D.; Atiya, A.F. Lower Upper Bound Estimation Method for Construction of Neural Network-Based Prediction Intervals. IEEE Trans. Neural Netw. 2010, 22, 337–346.
29. Pearce, T.; Brintrup, A.; Zaki, M.; Neely, A. High-Quality Prediction Intervals for Deep Learning: A Distribution-Free, Ensembled Approach. Int. Conf. Mach. Learn. 2018, 80, 4075–4084.
30. Romano, Y.; Patterson, E.; Candes, E. Conformalized Quantile Regression. Adv. Neural Inf. Process. Syst. 2019, 32.
31. Ye, L.; Liu, M.; Fu, D.; Wu, H.; Shi, H.; Huang, C. Probabilistic Site Adaptation for High-Accuracy Solar Radiation Datasets in the Western Sichuan Plateau. Remote Sens. 2025, 17, 1720.
32. Sooriyaarachchi, V.; Wijeratne, L.O.; Waczak, J.; Patra, R.; Lary, D.J.; Zhang, Y. Enhancing Hyperlocal Wavelength-Resolved Solar Irradiance Estimation Using Remote Sensing and Machine Learning. Remote Sens. 2025, 17, 2753.
33. Straub, N.; Herzberg, W.; Lorenz, E. HelioNet-IR: Combining Infrared and Visible Satellite Images for Solar Irradiance Forecasting in the Early-Morning Hours. Sol. RRL 2025, 9, 2500365.
34. Chu, Y.; Wang, Y.; Yang, D.; Chen, S.; Li, M. A Review of Distributed Solar Forecasting with Remote Sensing and Deep Learning. Renew. Sustain. Energy Rev. 2024, 198, 114391.
35. Sebastianelli, A.; Serva, F.; Ceschini, A.; Paletta, Q.; Panella, M.; Le Saux, B. Machine Learning Forecast of Surface Solar Irradiance from Meteo Satellite Data. Remote Sens. Environ. 2024, 315, 114431.
36. Lave, M.; Kleissl, J. Solar Variability of Four Sites Across the State of Colorado. Renew. Energy 2010, 35, 2867–2873.
37. Hoff, T.E.; Perez, R. Quantifying PV Power Output Variability. Sol. Energy 2010, 84, 1782–1793.
38. Wang, S.; Sun, Y.; Zhang, W.; Chung, C.Y.; Srinivasan, D. Very short-term wind power forecasting considering static data: An improved transformer model. Energy 2024, 312, 133577.
39. Marcos, J.; Marroyo, L.; Lorenzo, E.; Alvira, D.; Izco, E. From Irradiance to Output Power Fluctuations: The PV Plant as a Low Pass Filter. Prog. Photovolt. Res. Appl. 2011, 19, 505–510.
40. Liu, L.; Zhan, M.; Bai, Y. A Recursive Ensemble Model for Forecasting the Power Output of Photovoltaic Systems. Sol. Energy 2019, 189, 291–298.
41. Wang, F.; Xuan, Z.; Zhen, Z.; Li, K.; Wang, T.; Shi, M. A Day-Ahead PV Power Forecasting Method Based on LSTM-RNN Model and Time Correlation Modification Under Partial Daily Pattern Prediction Framework. Energy Convers. Manag. 2020, 212, 112766.
42. Yang, M.; Shen, X.; Huang, D.; Su, X. Fluctuation Classification and Feature Factor Extraction to Forecast very Short-Term Photovoltaic Output Powers. CSEE J. Power Energy Syst. 2023, 11, 661–670.
43. Diagne, M.; David, M.; Lauret, P.; Boland, J.; Schmutz, N. Review of Solar Irradiance Forecasting Methods and a Proposition for Small-Scale Insular Grids. Renew. Sustain. Energy Rev. 2013, 27, 65–76.
44. Dolara, A.; Grimaccia, F.; Leva, S.; Mussetta, M.; Ogliari, E. A Physical Hybrid Artificial Neural Network for Short Term Forecasting of PV Plant Power Output. Energies 2015, 8, 1138–1153.
45. Litjens, G.B.M.A.; Worrell, E.; Van Sark, W.G.J.H.M. Assessment of Forecasting Methods on Performance of Photovoltaic-Battery Systems. Appl. Energy 2018, 221, 358–373.
46. Wang, F.; Zhang, Z.; Liu, C.; Yu, Y.; Pang, S.; Duić, N.; Shafie-Khah, M.; Catalao, J.P. Generative Adversarial Networks and Convolutional Neural Networks Based Weather Classification Model for Day Ahead Short-Term Photovoltaic Power Forecasting. Energy Convers. Manag. 2019, 181, 443–462.
47. Yang, M.; Jiang, Y.; Xu, C.; Wang, B.; Wang, Z.; Su, X. Day-Ahead Wind Farm Cluster Power Prediction Based on Trend Categorization and Spatial Information Integration Model. Appl. Energy 2025, 388, 125580.
48. Li, Z.; Ye, L.; Song, X.; Luo, Y.; Pei, M.; Wang, K.; Yu, Y.; Tang, Y. Heterogeneous Spatiotemporal Graph Convolution Network for Multi-Modal Wind-PV Power Collaborative Prediction. IEEE Trans. Power Syst. 2023, 39, 5591–5608.
49. Agga, A.; Abbou, A.; Labbadi, M.; El Houm, Y. Short-Term Self Consumption PV Plant Power Production Forecasts Based on Hybrid CNN-LSTM, ConvLSTM Models. Renew. Energy 2021, 177, 101–112.
50. Lu, S.; Xu, Q.; Jiang, C.; Liu, Y.; Kusiak, A. Probabilistic Load Forecasting with a Non-Crossing Sparse-Group Lasso-Quantile Regression Deep Neural Network. Energy 2022, 242, 122955.
51. Lu, S.; Gao, Z.; Xu, Q.; Jiang, C.; Xie, T.; Zhang, A. Remaining Useful Life Prediction via Interactive Attention-Based Deep Spatio-Temporal Network Fusing Multisource Information. IEEE Trans. Ind. Electron. 2023, 71, 8007–8016.

Figure 1. Illustration of multi-head attention projection mechanism. The input feature matrix is linearly projected into queries (Q), keys (K), and values (V), which are then split into h parallel attention heads. Each head independently computes scaled dot-product attention (Equation (2)), and the outputs are concatenated and projected through W^{O} \in \mathbb{R}^{h d_{k} \times d} to restore the original dimensionality.

Figure 2. Overall architecture of the proposed MTSA-QRN. The framework integrates four key modules: (1) multi-scale temporal modeling for capturing both transient fluctuations and evolving trends of PV power dynamics, (2) graph-based meteorological interaction learning to represent nonlinear dependencies among remote-sensing/NWP-driven weather variables, (3) weather-adaptive feature refinement to modulate spatiotemporal representations under distinct atmospheric conditions (sunny, cloudy, rainy), and (4) probabilistic quantile prediction with adaptive calibration to ensure reliable uncertainty quantification for operational decision-making.

Figure 3. Core spatiotemporal learning mechanism of MTSA-QRN. Multi-scale temporal dependencies are captured via TCN, meteorological correlations are modeled by GAT, and a weather-adaptive module dynamically re-weights fused representations based on atmospheric regimes (sunny, cloudy, rainy), enabling robust feature learning under varying uncertainty conditions.

Figure 4. Multi-head self-attention temporal refinement in the temporal pathway of MTSA-QRN.

Figure 5. Typical normalized PV power generation profiles under different weather conditions showing regime-dependent volatility and trend behaviors.

Figure 6. Convergence trajectory of PICP toward the target coverage under golden-ratio adaptive boundary updates, showing improved reliability under varying meteorological uncertainty.

Figure 7. Multi-scale decomposition of historical PV power using variational mode decomposition (K = 7). Note: Only the first five components (IMF₁–IMF₅) are retained for the final model input, as determined by the feature importance analysis in Figure 8.

Figure 8. Feature relevance of intrinsic mode functions across six dependency measures.

Figure 9. Reconstruction quality of the VMD representation.

Figure 10. Comparison of target and achieved coverage after adaptive CQR.

Table 1. Summary of typical photovoltaic power forecasting approaches.

Method	Model	Features	References
Physical model	NWP-based, irradiance-to-power conversion	This method has strong physical interpretability, but is sensitive to parameter uncertainties and weather variations.	[8,9]
Statistical learning	ARIMA, regression analysis	The principle of this method is simple and computationally efficient, but it has limited capability for nonlinear pattern modeling.	[11,43]
Machine learning	SVR, RF, ELM	This method can capture nonlinear relationships effectively, but is prone to overfitting with limited training samples.	[10,12,19]
Deep learning	CNN, LSTM, GRU, Transformer	This method has strong temporal feature extraction ability, but often ignores meteorological spatial correlations.	[15,16,17,27,38]
Hybrid method	CNN-LSTM, EMD-LSTM, GAT-based	This method integrates multiple techniques for improved accuracy, but increases model complexity and training difficulty.	[44,46,49]

Table 2. Ablation study on spatial dependency modeling.

Model Configuration	CRPS	PICP_80%	PICP_85%	PICP_90%	PICP_95%	PINAW_80%	PINAW_85%	PINAW_90%	PINAW_95%
Baseline (No GAT)	0.0554	77.74%	87.25%	93.63%	97.89%	0.4083	0.4233	0.5059	0.5562
+GAT (Fixed)	0.0490	78.48%	83.25%	88.02%	93.97%	0.2920	0.3368	0.4071	0.5041
+GAT (Adaptive)	0.0453	81.01%	85.52%	90.53%	95.61%	0.2737	0.3145	0.3870	0.4960

Table 3. Comparison of probabilistic forecasting performance before and after adaptive calibration (90% prediction interval).

Model	Metric	Before Calibration	After Calibration
Quantile-SVR	CRPS	0.4758	0.4484
	PICP (90%)	0.7242	0.8733
	PINAW (90%)	0.0064	0.5331
Quantile-GRU	CRPS	0.0908	0.0887
	PICP (90%)	0.8181	0.9221
	PINAW (90%)	0.0053	0.3039
Quantile-LSTM	CRPS	0.0995	0.0902
	PICP (90%)	0.7572	0.9157
	PINAW (90%)	0.0044	0.2610
Quantile-Transformer	CRPS	0.0954	0.0932
	PICP (90%)	0.8241	0.9130
	PINAW (90%)	0.0401	0.2369
MTSA-QRN	CRPS	0.0400	0.0453
	PICP (90%)	0.8140	0.9053
	PINAW (90%)	0.3116	0.3870

Table 4. Layer-wise ablation study of MTSA-QRN under the 90% prediction interval.

Model Configuration (Layer-Tagged)	CRPS	ΔCRPS (%)	PICP (90%)	PINAW (90%)
MTSA-QRN	0.0453	— —	0.9053	0.3870
(Structural) w/o Transformer (No global temporal attention)	0.1097	+142.2%	0.9323	0.3481
(Structural) w/o TCN (No multi-scale temporal convolution)	0.1127	+148.9%	0.9402	0.3696
(Structural) w/o GAT (No meteorological dependency modeling)	0.0554	+22.3%	0.9363	0.5059
(Adaptive) w/o Weather-Adaptive module (Fixed regime modeling)	0.1338	+195.4%	0.9196	0.3190
(Adaptive) w/o Fusion Gate (Fixed feature fusion)	0.1434	+216.6%	0.9141	0.3215
(Calibration) w/o CQR (No reliability calibration)	0.1284	+183.4%	0.9282	0.3178

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Wang, G.; Zhou, Y.; Yan, Y.; Zhou, Z.; Yang, Z.; Dai, L.; Huang, J. Probabilistic Photovoltaic Power Forecasting with Reliable Uncertainty Quantification via Multi-Scale Temporal–Spatial Attention and Conformalized Quantile Regression. Sustainability 2026, 18, 739. https://doi.org/10.3390/su18020739
