1. Introduction
Solar PV energy is widely regarded as one of the most promising renewable energy alternatives owing to its environmental sustainability, technological maturity, and ubiquitous availability [1]. PV systems have consequently become an essential component of sustainable energy development worldwide [2]. According to the International Energy Agency, the global cumulative installed PV capacity reached 2.2 TW by the end of 2024, with 602 GW of new installations added in that year alone, representing a year-on-year growth of 32% [3]. China, as the world’s largest PV market, achieved 887 GW of total installed capacity by the end of 2024, accounting for approximately 40% of global capacity, with a record-breaking 277 GW added in 2024 [4]. With annual capacity growth rates exceeding 30% in recent years, the solar photovoltaic industry has entered a phase of rapid expansion.
To support reliable grid operation, accurate short-term PV forecasting has become increasingly important. Existing forecasting approaches can generally be categorized into physical models, statistical learning methods, and hybrid frameworks [5]. Physical models depend on numerical weather predictions but are sensitive to parameter uncertainties [6,7]. Data-driven and deep learning models, including CNNs, LSTMs, and attention architectures, have demonstrated competitive performance by capturing nonlinear temporal patterns [8,9,10,11,12,13,14,15,16]. In recent years, probabilistic forecasting has attracted growing attention due to its ability to quantify forecast uncertainty via prediction intervals or distributions, thereby enabling risk-aware decision-making in power system operations [17,18,19,20]. Quantile regression is widely adopted due to its distribution-free nature [21,22,23,24,25], yet deep quantile models often exhibit miscalibrated interval coverage under changing weather conditions [26,27]. Conformalized Quantile Regression (CQR) provides distribution-free statistical coverage [28], but efficient calibration strategies for large-scale PV forecasting remain underexplored.
Meanwhile, advances in remote sensing technologies have provided critical support for solar irradiance estimation and PV forecasting. Satellite-derived irradiance and meteorological products offer wide spatial coverage and effectively capture cloud-driven rapid variability, which is essential for improving the credibility of probabilistic forecasts. Consequently, the integration of deep learning architectures with uncertainty-aware modeling using remote-sensing data has emerged as an important research direction, aiming to enhance predictive robustness in high-penetration renewable energy systems [29,30,31,32,33].
Despite the progress achieved to date, two key challenges remain. First, PV generation exhibits complex multi-scale temporal dynamics ranging from second-level fluctuations to seasonal patterns due to cloud movement, diurnal cycles, and synoptic-scale changes [34,35,36,37]. Existing models struggle to jointly capture these hierarchical dependencies in an end-to-end manner. Second, forecasting accuracy is strongly influenced by nonlinear interdependencies among meteorological variables such as irradiance, temperature, and wind speed [38,39,40,41]. Although attention mechanisms enable adaptive temporal modeling [42,43], spatial correlations among meteorological features are often overlooked [44,45]. Graph Attention Networks (GATs) can infer dynamic dependency structures [46,47], yet their integration into probabilistic PV forecasting remains insufficiently explored under varying weather conditions [48,49].
To provide a systematic summary of the above discussion, Table 1 compares typical photovoltaic power forecasting approaches based on their technical routes.
Beyond the specific domain of PV forecasting, recent studies have significantly advanced probabilistic and spatiotemporal modeling in related time-series fields. For instance, in power load forecasting, a non-crossing sparse-group Lasso-quantile regression deep neural network has been proposed to effectively address the quantile crossing problem through regularization [50]. Similarly, regarding complex spatiotemporal dynamics, interactive attention-based deep networks have been successfully applied to remaining useful life prediction by fusing multisource information [51].
However, compared with these existing architectures, the specific characteristics of photovoltaic power generation require specialized structural designs to address remaining challenges. First, PV generation exhibits distinct multi-scale temporal dynamics (e.g., high-frequency cloud transients vs. low-frequency diurnal cycles) that differ from general load or equipment degradation patterns, requiring explicit disentanglement. Second, standard interactive attention mechanisms often lack the ability to dynamically adapt to varying atmospheric regimes (e.g., Sunny vs. Rainy), which is essential for accurately modeling meteorological interactions in PV systems. Finally, to ensure statistically valid prediction intervals under extreme weather shifts, an adaptive calibration strategy is necessary beyond standard regularization techniques.
Given that PV power generation is jointly influenced by multi-scale temporal dynamics and spatially correlated meteorological disturbances, existing approaches still face limitations in accurately capturing these complex behaviors and reliably quantifying forecasting uncertainty. To address these challenges, this paper proposes an MTSA-QRN for short-term probabilistic PV power forecasting. First, dilated convolutions and multi-head self-attention are jointly employed to model hierarchical temporal dependencies, enhancing the simultaneous characterization of high-frequency cloud-induced fluctuations and low-frequency diurnal evolution. Second, an adaptive Graph Attention Network is introduced to explicitly capture nonlinear interactions among meteorological variables, enabling more responsive modeling of weather-driven volatility. Moreover, a weather-adaptive fusion mechanism is designed to dynamically adjust the contribution of temporal and spatial features under different atmospheric regimes, improving robustness across diverse meteorological conditions. Finally, an adaptive conformalized quantile regression strategy is incorporated to calibrate prediction intervals, ensuring statistically reliable coverage performance at multiple confidence levels. The main contributions of this study are summarized as follows:
A unified temporal–spatial probabilistic forecasting framework is developed by systematically coordinating multi-scale temporal modeling and spatial dependency learning. Specifically, dilated TCN and multi-head self-attention are jointly employed to capture hierarchical temporal dependencies, while GAT-based modeling is used to characterize nonlinear meteorological interactions. Rather than introducing new standalone modules, this design enables a more comprehensive representation of PV generation dynamics under multi-source disturbances, leading to improved predictive performance in complex scenarios, as validated by comparative experiments.
To address regime-dependent photovoltaic power behaviors, a weather-related feature fusion strategy is designed to adaptively adjust the contributions of temporal and spatial representations. This mechanism allows the model to preserve stable trend information under relatively smooth conditions while remaining sensitive to pronounced power fluctuations, thereby alleviating the performance degradation commonly caused by fixed fusion schemes and improving generalization across varying meteorological situations.
An adaptive conformalized quantile regression (CQR) calibration strategy is incorporated to balance prediction interval coverage reliability and interval sharpness. By iteratively adjusting calibration parameters in a data-driven manner, the proposed approach avoids overly conservative interval widening and yields more informative uncertainty estimates. Experimental results demonstrate that this calibration strategy provides consistent coverage performance while maintaining practical usefulness for risk-aware decision-making.
The remainder of this paper is organized as follows. Section 2 presents the theoretical background underpinning the proposed methodology, including multi-scale decomposition, attention mechanisms, and probabilistic modeling formulations. Section 3 introduces the MTSA-QRN model in detail, covering temporal–spatial representation learning, weather-adaptive feature fusion, and adaptive quantile calibration. Section 4 reports the experimental evaluation conducted on real PV operational data and discusses comparative probabilistic forecasting performance against representative baselines. Finally, Section 5 concludes the study and outlines future research directions.
2. Methodological Background
Reliable probabilistic PV forecasting requires modeling temporal dependencies, meteorological feature interactions, and statistically valid uncertainty quantification. This section briefly introduces the theoretical foundations supporting the proposed temporal–spatial attention architecture and adaptive calibration strategy, while Section 3 elaborates the core innovations.
2.1. Temporal Dependency Modeling Using Attention Mechanisms
Photovoltaic power time series are highly non-stationary due to cloud dynamics and atmospheric instability, which cause rapid changes in irradiance and strong short-term fluctuations in power output. To adaptively capture temporal dependencies under varying meteorological conditions, attention mechanisms dynamically emphasize the most informative timestamps and features.
Let $X \in \mathbb{R}^{T \times d}$ denote the input feature matrix, where $T$ denotes the number of time steps and $d$ denotes the feature dimension including historical PV power and meteorological variables such as irradiance, temperature, humidity, and wind speed. The query ($Q$), key ($K$), and value ($V$) representations are obtained through three learnable projection matrices $W_Q$, $W_K$, $W_V$:

$$Q = XW_Q, \qquad K = XW_K, \qquad V = XW_V \tag{1}$$

In this formulation, $Q$ represents the direction of attention that the model aims to focus on at a given time, $K$ encodes the contextual information provided by each temporal or meteorological feature, and $V$ carries the semantic or numerical content to be aggregated according to the learned attention weights. The similarity between each query and key determines the attention scores, which are normalized using the softmax function to ensure that the attention weights sum to one. The resulting self-attention output is given by

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V \tag{2}$$

Here, the scaling factor $1/\sqrt{d_k}$, with $d_k$ the key dimension, prevents the dot-product values from becoming excessively large, which would push the softmax into saturated regions and lead to vanishing gradients. This mechanism allows the network to dynamically adjust the relative importance of each time step or meteorological feature, thereby enabling adaptive modeling of temporal dependencies and spatial correlations under varying weather conditions. Such flexibility is crucial for PV forecasting, where the relationship between irradiance and power generation can fluctuate rapidly.
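As a minimal illustration of the scaled dot-product self-attention described above, the following Python sketch uses only the standard library. The toy input `X`, the identity projection matrices, and the helper names `matmul`, `softmax`, and `self_attention` are assumptions made for illustration and are not taken from the proposed model.

```python
import math

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention for a (T x d) input matrix."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]            # K transposed
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, k_t)]
    weights = [softmax(row) for row in scores]      # each row sums to one
    return matmul(weights, v), weights

# Toy input: T = 3 time steps, d = 2 features (illustrative values only).
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]                 # hypothetical projections
output, weights = self_attention(X, IDENTITY, IDENTITY, IDENTITY)
```

Each row of `weights` states how strongly one time step attends to every other time step; in a trained model the projection matrices would be learned rather than fixed.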
Building upon this foundation, the multi-head attention mechanism extends self-attention by performing multiple parallel attention computations in different representation subspaces. This design allows the model to capture diverse patterns and dependencies simultaneously. The multi-head attention is formulated as

$$\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_h)\,W^{O}, \qquad \mathrm{head}_i = \mathrm{Attention}(QW_i^{Q}, KW_i^{K}, VW_i^{V}) \tag{3}$$

To clarify the projection and mapping relationships in Equation (3), Figure 1 illustrates the complete multi-head attention mechanism. Given the input feature matrix $X$, linear transformations parameterized by $W_Q$, $W_K$, and $W_V$ are applied to project the input into query ($Q$), key ($K$), and value ($V$) representations. These representations are further mapped into multiple subspaces corresponding to different attention heads. Each head performs scaled dot-product attention (Equation (2)) in a lower-dimensional subspace, typically with dimensionality $d_k = d/h$ for $h$ heads. The outputs from all attention heads are then concatenated and passed through an output projection matrix $W^{O}$ to restore the original feature dimensionality.
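The head-splitting and concatenation steps can be sketched as follows. This is a simplified illustration under stated assumptions: the per-head projections are replaced by slicing the feature axis into equal blocks, the output projection is taken to be the identity, and the function names are hypothetical.

```python
import math

def attention(q, k, v):
    """Scaled dot-product attention on lists of row vectors."""
    d_k = len(k[0])
    out = []
    for q_row in q:
        scores = [sum(a * b for a, b in zip(q_row, k_row)) / math.sqrt(d_k) for k_row in k]
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        out.append([sum(w * v_row[c] for w, v_row in zip(weights, v))
                    for c in range(len(v[0]))])
    return out

def split_heads(m, num_heads):
    """Slice the feature axis into num_heads equal blocks (assumed stand-in for per-head projections)."""
    width = len(m[0]) // num_heads
    return [[row[h * width:(h + 1) * width] for row in m] for h in range(num_heads)]

def multi_head_attention(q, k, v, num_heads):
    """Run attention per head and concatenate the results (output projection omitted)."""
    heads = [attention(qh, kh, vh) for qh, kh, vh in
             zip(split_heads(q, num_heads), split_heads(k, num_heads), split_heads(v, num_heads))]
    return [[value for head in heads for value in head[t]] for t in range(len(q))]

# Toy input: T = 3 time steps, d = 4 features, so each of 2 heads sees 2 features.
X = [[1.0, 0.0, 0.5, 0.2], [0.0, 1.0, 0.1, 0.9], [1.0, 1.0, 0.3, 0.4]]
two_heads = multi_head_attention(X, X, X, num_heads=2)
```

Because each head computes its own attention weights from its own feature block, the two halves of every output row can follow different temporal patterns.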
Each attention head operates in a distinct subspace, enabling the network to focus on different aspects of temporal and meteorological dependencies. Some heads may specialize in short-term fluctuations, such as those caused by rapid cloud movements, while others capture long-term patterns such as diurnal and seasonal trends. Through this multi-head structure, the model achieves a more comprehensive understanding of the complex nonlinear relationships in PV power time series, effectively integrating both fine-grained dynamics and global temporal patterns.
This attention-based temporal modeling provides the foundation for the temporal pathway enhancement introduced in Section 3.1.
2.2. Graph Attention for Meteorological Interaction Representation
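PV generation is influenced by multiple meteorological variables whose nonlinear relationships evolve across weather conditions. Graph Attention Networks (GATs) enable adaptive modeling of such feature interactions. Let $h_i$ denote the representation of the $i$-th meteorological variable, and let $\mathcal{N}_i$ represent its neighborhood in the constructed feature graph. A learnable vector $a$ and weight matrix $W$ are used to compute the attention coefficient $\alpha_{ij}$ between variables $i$ and $j$:

$$\alpha_{ij} = \frac{\exp\left(\mathrm{LeakyReLU}\left(a^{\top}[W h_i \,\|\, W h_j]\right)\right)}{\sum_{k \in \mathcal{N}_i} \exp\left(\mathrm{LeakyReLU}\left(a^{\top}[W h_i \,\|\, W h_k]\right)\right)}$$

where $\|$ denotes concatenation. The updated feature embedding for variable $i$ is obtained by aggregating contextual information from its neighbors as

$$h_i' = \sigma\left(\sum_{j \in \mathcal{N}_i} \alpha_{ij} W h_j\right)$$

where $\sigma(\cdot)$ is a nonlinear activation. This mechanism provides the spatial dependency modeling basis for the fusion strategy introduced in Section 3.2.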
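A single-head graph-attention update over a small meteorological feature graph can be sketched as below. The variable names, the fully connected graph with self-loops, the identity weight matrix, the attention vector values, and the choice of `tanh` as the output activation are all assumptions for illustration; they are not the configuration used in the experiments.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def gat_update(h, neighbors, w, a):
    """One graph-attention update: returns attention coefficients and new embeddings."""
    # Linear transform W h_i for every node.
    wh = {i: [sum(wc * xc for wc, xc in zip(w_row, vec)) for w_row in w]
          for i, vec in h.items()}
    alpha, updated = {}, {}
    for i in h:
        # Unnormalized score a^T [W h_i || W h_j] for each neighbor j.
        scores = [leaky_relu(sum(p * q for p, q in zip(a, wh[i] + wh[j])))
                  for j in neighbors[i]]
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        alpha[i] = {j: e / total for j, e in zip(neighbors[i], exps)}
        # Attention-weighted aggregation followed by an assumed tanh activation.
        updated[i] = [math.tanh(sum(alpha[i][j] * wh[j][c] for j in neighbors[i]))
                      for c in range(len(w))]
    return alpha, updated

# Hypothetical two-dimensional embeddings of three meteorological variables.
features = {"irradiance": [0.9, 0.1], "temperature": [0.4, 0.6], "wind_speed": [0.2, 0.3]}
graph = {name: list(features) for name in features}    # fully connected, self-loops included
W = [[1.0, 0.0], [0.0, 1.0]]                            # illustrative weight matrix
a = [0.5, -0.5, 1.0, 0.25]                              # illustrative attention vector
alpha, updated = gat_update(features, graph, W, a)
```

The dictionary `alpha[i]` holds the normalized coefficients over the neighbors of variable `i`, so inspecting it shows which meteorological drivers dominate each update.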
2.3. Quantile Regression and Calibration for Uncertainty Modeling
To explicitly quantify predictive uncertainty in PV power forecasting, this study adopts a quantile regression framework, which estimates conditional quantiles of the target variable without assuming an explicit parametric form of the conditional output distribution. Unlike deterministic point forecasting, uncertainty is represented through multiple quantile-dependent outputs that characterize the conditional distribution of PV power given the input features.
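For a quantile level $\tau \in (0, 1)$, the prediction function $f_{\tau}(\cdot)$ is obtained by minimizing the asymmetric pinball loss

$$\mathcal{L}_{\tau}(y, \hat{y}_{\tau}) = \max\left\{\tau\,(y - \hat{y}_{\tau}),\ (\tau - 1)\,(y - \hat{y}_{\tau})\right\}$$

where $y$ denotes the observed PV power and $\hat{y}_{\tau} = f_{\tau}(x)$ represents the predicted value at quantile level $\tau$. Different quantile levels capture different parts of the conditional distribution, thereby explicitly reflecting predictive uncertainty.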
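The asymmetry of the pinball loss is easiest to see numerically. The short sketch below evaluates it for an upper quantile level; the function names and the sample values are illustrative assumptions.

```python
def pinball_loss(y, y_hat, tau):
    """Pinball (quantile) loss for one observation at quantile level tau."""
    diff = y - y_hat
    return max(tau * diff, (tau - 1.0) * diff)

def mean_pinball_loss(ys, preds, tau):
    """Average pinball loss over paired observations and predictions."""
    return sum(pinball_loss(y, p, tau) for y, p in zip(ys, preds)) / len(ys)

# At tau = 0.9, under-prediction is penalized nine times more than over-prediction.
under = pinball_loss(10.0, 8.0, 0.9)    # prediction 2 units too low  -> about 1.8
over = pinball_loss(10.0, 12.0, 0.9)    # prediction 2 units too high -> about 0.2
```

Minimizing this loss therefore pushes the 0.9-level prediction upward until roughly 90% of the observations fall below it.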
Based on the estimated quantiles, prediction intervals can be constructed by combining lower and upper estimated quantiles $\hat{y}_{\tau_l}$ and $\hat{y}_{\tau_u}$, expressed as

$$\mathrm{PI} = \left[\hat{y}_{\tau_l},\ \hat{y}_{\tau_u}\right]$$

Here, $\tau_l$ and $\tau_u$ represent the lower and upper quantile levels, respectively, and the resulting interval provides an interpretable measure of forecast uncertainty at a specified confidence level.
However, distribution shifts and finite-sample effects may cause predicted intervals to deviate from nominal confidence levels. To improve coverage reliability, conformal calibration introduces an adaptive adjustment term $\delta$ to the lower and upper quantile estimates $\hat{y}_{\tau_l}$ and $\hat{y}_{\tau_u}$:

$$\mathrm{PI}^{\mathrm{cal}} = \left[\hat{y}_{\tau_l} - \delta,\ \hat{y}_{\tau_u} + \delta\right]$$

In this formulation, $\delta$ represents the calibration parameter learned from calibration data, which compensates for systematic under-coverage or over-coverage of the raw quantile-based intervals. The calibrated interval $\mathrm{PI}^{\mathrm{cal}}$ thus provides a more reliable uncertainty estimate while preserving the distribution-free property of quantile regression. This formulation supports the development of the adaptive calibration strategy expanded in Section 3.4.
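For reference, the standard split-conformal form of CQR, which the adaptive strategy builds on, can be sketched as follows. This is the generic procedure rather than the adaptive variant proposed here; the calibration data, the fixed raw interval, and the function names are assumptions for illustration.

```python
import math

def cqr_adjustment(lower, upper, y, alpha):
    """Split-conformal CQR: adjustment from conformity scores on a calibration set."""
    # A positive score means the observation lies outside the raw interval.
    scores = sorted(max(lo - obs, obs - hi) for lo, hi, obs in zip(lower, upper, y))
    n = len(scores)
    rank = math.ceil((n + 1) * (1.0 - alpha))   # finite-sample corrected rank
    if rank > n:
        return float("inf")                      # too few calibration points for this level
    return scores[rank - 1]

def apply_adjustment(lower, upper, delta):
    """Widen (delta > 0) or shrink (delta < 0) both bounds symmetrically."""
    return [lo - delta for lo in lower], [hi + delta for hi in upper]

# Toy calibration set: a constant raw interval [2, 4] and nine observations.
y_cal = [1.5, 2.5, 3.0, 3.5, 4.2, 3.9, 2.1, 4.6, 1.0]
lower_cal = [2.0] * len(y_cal)
upper_cal = [4.0] * len(y_cal)
delta = cqr_adjustment(lower_cal, upper_cal, y_cal, alpha=0.2)   # about 0.6
lower_adj, upper_adj = apply_adjustment(lower_cal, upper_cal, delta)
```

With nine calibration points and a miscoverage level of 0.2, the rank is 8, so the adjustment equals the eighth-smallest conformity score and the calibrated interval covers at least eight of the nine calibration observations.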
3. Proposed Methodology
The MTSA-QRN is designed to learn multi-scale dynamic variations in PV power sequences while capturing complex meteorological dependencies that evolve under varying weather conditions. The overall framework is illustrated in Figure 2. Historical PV power data and meteorological variables are separately processed through temporal and spatial learning pathways. The temporal pathway extracts both transient and evolving patterns via multi-scale convolutional structures, while the spatial pathway employs graph attention to represent nonlinear coupling among meteorological drivers such as irradiance attenuation and wind-induced cooling. The two representations are further guided by a latent weather-state representation to enhance predictive robustness under sunny, cloudy, and rainy conditions. Finally, the unified spatiotemporal embeddings are fed into a quantile prediction module and calibrated to ensure statistically reliable uncertainty quantification for operational decision-making.
To further clarify how temporal and meteorological dependencies interact within the model, the core feature learning mechanism is additionally illustrated in Figure 3, where the TCN-based temporal pathway and the GAT-based spatial pathway are jointly modulated through a weather-adaptive fusion strategy. This mechanism dynamically adjusts the contribution of heterogeneous features under varying atmospheric disturbances, constructing weather-aware representations that enhance robustness against prediction performance degradation.
Through this coordinated architecture, MTSA-QRN effectively integrates multi-scale temporal dependency modeling, meteorological interaction enhancement, weather-adaptive refinement, and reliability-aware quantile estimation. These four innovations collectively enhance forecasting performance in high-uncertainty environments such as severe cloud instability and sudden irradiance fluctuations. Detailed methodological advancements are presented in Section 3.1, Section 3.2, Section 3.3 and Section 3.4.
3.1. Multi-Scale Temporal-Attention Dependency Enhancement
PV power output exhibits highly non-stationary temporal dynamics with different fluctuation mechanisms operating at distinct time scales, which are difficult to jointly capture using conventional attention modules. To address this limitation, a multi-scale temporal-attention dependency enhancement mechanism is introduced. Specifically, temporal representations are extracted across multiple receptive fields using dilated convolutions with different dilation factors to separate cloud-driven short-term variability from diurnal trend evolution. These multi-scale temporal features form complementary temporal dependencies that cannot be learned at a single resolution.
To further refine temporal focus, an attention-based temporal gating mechanism is applied to dynamically weight the multi-scale representations. Given multi-scale features $\{F^{(s)}\}_{s=1}^{S}$, temporal attention weights $\alpha_s$ are computed as

$$\alpha_s = \frac{\exp\left(g(F^{(s)})\right)}{\sum_{s'=1}^{S} \exp\left(g(F^{(s')})\right)}$$

where $g(\cdot)$ is a learnable scoring function measuring temporal relevance under current meteorological input. The enhanced temporal representation is derived as

$$F_{\mathrm{tem}} = \sum_{s=1}^{S} \alpha_s F^{(s)}$$

Thus, transient disturbances such as cloud transients receive higher contributions during highly unstable weather periods, whereas low-frequency trends dominate in clear and stable conditions. This formulation establishes an adaptive multi-scale temporal focus that follows evolving atmospheric dynamics, providing temporal generalization not achievable by isolated TCN or standard self-attention architectures.
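The two steps of this subsection, multi-scale extraction by dilated convolutions and softmax gating across scales, can be sketched in a few lines. The averaging kernel, the dilation factors 1, 2, and 4, the toy power series, and in particular the fixed `scores` list standing in for the learnable scoring function are assumptions for illustration only.

```python
import math

def dilated_causal_conv(series, kernel, dilation):
    """Causal 1-D convolution; history before the start of the series counts as zero."""
    out = []
    for t in range(len(series)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                acc += weight * series[idx]
        out.append(acc)
    return out

def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_scales(features, scores):
    """Weight each scale by its softmax-normalized score and sum over scales."""
    weights = softmax(scores)
    fused = [sum(w * f[t] for w, f in zip(weights, features))
             for t in range(len(features[0]))]
    return weights, fused

# Toy PV power series (arbitrary units) processed at three temporal scales.
power = [0.0, 1.0, 3.0, 6.0, 8.0, 7.0, 4.0, 2.0]
features = [dilated_causal_conv(power, [0.5, 0.5], d) for d in (1, 2, 4)]
scores = [0.2, 1.0, -0.5]     # hypothetical outputs of the learnable scoring function
weights, fused = fuse_scales(features, scores)
```

A larger dilation lets the same two-tap kernel reach further back in time, and the gating weights decide how much each receptive field contributes to the fused representation.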
From the perspective of photovoltaic power forecasting, the above optimization process plays a clear functional role at different stages of temporal prediction. The multi-scale temporal representations extracted by dilated convolutions provide complementary views of short-term fluctuations and long-term trends in PV power output. Subsequently, the attention-based temporal gating mechanism adaptively reweights these representations, allowing the model to emphasize informative temporal components given prevailing meteorological conditions. As a result, the enhanced temporal representation serves as an optimized feature embedding that directly supports subsequent forecasting stages, improving both temporal generalization and predictive robustness.
To further illustrate the refinement process, Figure 4 visualizes the multi-head self-attention refinement module used to inject global contextual awareness into the enhanced temporal representation, ensuring a more discriminative temporal embedding before it is passed into the spatiotemporal fusion stage.
3.2. Meteorological Interaction Enhancement via Adaptive Graph Attention
Photovoltaic power generation is driven by multiple meteorological variables whose relationships vary depending on weather conditions. Standard GATs assume static and homogeneous feature correlations, which limits their ability to adapt to dynamic coupling effects induced by clouds, irradiance attenuation, thermal efficiency shifts, and wind-driven cooling.
To address these limitations, an adaptive graph attention mechanism is introduced to model heterogeneous, directional, and weather-sensitive meteorological interactions.
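Given meteorological feature embeddings $H = \{h_1, \ldots, h_N\}$, a learnable transformation is applied:

$$\tilde{h}_i = W h_i$$

Attention coefficients between variables $i$ and $j$ are computed as

$$\alpha_{ij} = \frac{\exp\left(\sigma\left(a^{\top}[\tilde{h}_i \,\|\, \tilde{h}_j]\right)\right)}{\sum_{k \in \mathcal{N}_i} \exp\left(\sigma\left(a^{\top}[\tilde{h}_i \,\|\, \tilde{h}_k]\right)\right)}$$

where $a$ is a learnable vector, $\mathcal{N}_i$ denotes the neighbors of variable $i$, and $\sigma(\cdot)$ is a nonlinear activation function. The updated feature embedding for variable $i$ is obtained through context-aware aggregation:

$$h_i' = \sum_{j \in \mathcal{N}_i} \alpha_{ij}\, \tilde{h}_j$$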
To further enhance adaptability under diverse atmospheric conditions, a latent weather-state modulation is introduced to dynamically refine attention coefficient magnitudes:
where
denotes a global weather representation extracted from the unified spatiotemporal pathway, and
is a learnable transformation matrix. This modulation enables the model to strengthen humidity–cloud coupling under cloudy regimes, emphasize temperature–power efficiency interactions on clear days, and reinforce wind–thermal dissipation relationships during high-wind-speed periods, resulting in more accurate characterization of meteorological-driven uncertainty.
Such flexible spatial relational learning contributes to improved characterization of meteorological-driven variability, especially when environmental conditions deviate from typical patterns.
Finally, the derived condition-aware interaction representation is forwarded to the weather-adaptive feature fusion module detailed in Section 3.3, serving as a refined spatial input for unified spatiotemporal learning.
3.3. Weather-Adaptive Feature Fusion
PV generation behavior exhibits pronounced regime dependence driven by atmospheric variations such as cloud movement, seasonal irradiance shifts, and humidity-induced attenuation. Sunny periods generally produce smooth bell-shaped curves governed by deterministic solar elevation, whereas cloudy and rainy periods are characterized by abrupt power drops and high-frequency volatility. These distinct dynamics are shown in Figure 5, demonstrating that a fixed spatiotemporal fusion strategy cannot maintain forecasting robustness under rapidly evolving weather patterns.
To address these challenges, a weather-adaptive feature fusion mechanism is introduced. Given unified temporal features $F_{\mathrm{tem}}$ from Section 3.1 and meteorological interaction features $F_{\mathrm{met}}$ from Section 3.2, latent weather states are first inferred through a learnable nonlinear transformation:

$$z = \phi\left(W_z\,[F_{\mathrm{tem}} \,\|\, F_{\mathrm{met}}] + b_z\right)$$

where $\|$ denotes concatenation, $\phi(\cdot)$ is a nonlinear activation, and $z$ represents the atmospheric state embedding (e.g., sunny-like vs. cloudy-like vs. rainy-like patterns).
The fusion weights are designed to adapt to evolving atmospheric regimes through a feedback-driven update mechanism. Specifically, the weights respond to latent weather-state representations inferred from spatiotemporal features, enabling the dynamic adjustment of contributions from the temporal and meteorological pathways under different operating conditions.
A soft weighting function is then designed to map weather states to dynamic fusion coefficients:

$$[\lambda_{\mathrm{tem}},\ \lambda_{\mathrm{met}}] = \mathrm{softmax}\left(W_{\lambda}\, z + b_{\lambda}\right)$$

where $\lambda_{\mathrm{tem}}$ and $\lambda_{\mathrm{met}}$ indicate the contribution of temporal and meteorological pathways, respectively.
The final fused representation is derived as

$$F_{\mathrm{fuse}} = \lambda_{\mathrm{tem}}\, F_{\mathrm{tem}} + \lambda_{\mathrm{met}}\, F_{\mathrm{met}}$$

Thus, under stable irradiance conditions, the model emphasizes trend-dominant temporal structures (larger $\lambda_{\mathrm{tem}}$), while under volatile cloud coverage, the fusion prioritizes meteorology-driven disturbance cues (larger $\lambda_{\mathrm{met}}$).
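The behavior of such a gate can be illustrated with a minimal sketch. It assumes a two-dimensional weather state (a clearness score and a volatility score), hand-picked gate parameters, and a two-way softmax; none of these values are learned or taken from the experiments, and the function names are hypothetical.

```python
import math

def fusion_weights(weather_state, w_temporal, w_meteo):
    """Map a weather-state vector to two softmax-normalized fusion coefficients."""
    logit_t = sum(a * b for a, b in zip(w_temporal, weather_state))
    logit_m = sum(a * b for a, b in zip(w_meteo, weather_state))
    peak = max(logit_t, logit_m)
    exp_t, exp_m = math.exp(logit_t - peak), math.exp(logit_m - peak)
    return exp_t / (exp_t + exp_m), exp_m / (exp_t + exp_m)

def fuse(f_temporal, f_meteo, weather_state, w_temporal, w_meteo):
    """Convex combination of the temporal and meteorological feature vectors."""
    lam_t, lam_m = fusion_weights(weather_state, w_temporal, w_meteo)
    fused = [lam_t * a + lam_m * b for a, b in zip(f_temporal, f_meteo)]
    return fused, (lam_t, lam_m)

# Hypothetical gate parameters: clearness favors the temporal pathway,
# volatility favors the meteorological pathway.
W_TEMPORAL = [2.0, -2.0]
W_METEO = [-2.0, 2.0]
f_tem, f_met = [0.8, 0.6, 0.4], [0.1, 0.9, 0.3]

_, sunny = fuse(f_tem, f_met, [0.9, 0.1], W_TEMPORAL, W_METEO)    # [clearness, volatility]
_, cloudy = fuse(f_tem, f_met, [0.2, 0.8], W_TEMPORAL, W_METEO)
```

With these illustrative parameters the sunny-like state places most of the weight on the temporal pathway, while the cloudy-like state shifts it to the meteorological pathway, which is the qualitative behavior the fusion mechanism is designed to learn.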
To further enhance adaptability, a residual modulation branch is introduced to correct distributional deviations caused by severe weather interventions:
where
compensates for weather-induced representation bias.
Through this weather-aware fusion strategy, MTSA-QRN achieves stronger responsiveness to short-term variability in unstable conditions, improved trend integrity during clear-sky periods, and enhanced robustness against atmospheric uncertainty propagation.
The refined representation is then supplied to the probabilistic quantile regression and adaptive calibration module in Section 3.4.
3.4. Probabilistic Prediction and Adaptive Calibration
Benefiting from the stability-constrained adaptive fusion mechanism described in Section 3.3, the refined spatiotemporal representations provide a robust foundation for subsequent probabilistic prediction and calibration.
To quantify predictive uncertainty in PV forecasting, neural quantile regression is employed to estimate conditional quantiles without assuming any explicit distributional form. However, coverage deviations often arise due to distributional shifts induced by weather transitions or limited training samples, causing the empirical coverage to diverge from the desired confidence level. To mitigate such deviations, a coverage-driven adaptive boundary calibration strategy is devised to refine prediction intervals while preserving sharpness.
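Given initial prediction intervals $[\hat{L}_i, \hat{U}_i]$ for samples $i = 1, \ldots, N$ with observations $y_i$, the empirical coverage is first evaluated as

$$\widehat{\mathrm{PICP}} = \frac{1}{N} \sum_{i=1}^{N} \mathbb{1}\left\{\hat{L}_i \le y_i \le \hat{U}_i\right\}$$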
The coverage bias can be represented by
A dynamic adjustment term
is iteratively updated using a golden-ratio-guided rule:
The calibrated interval at iteration
is then
This update formulation introduces a contraction property toward the equilibrium boundary
, which supports stable convergence of the calibration process even when weather-driven uncertainty fluctuates. As highlighted in
Figure 6, the empirical coverage gradually approaches the target level (90% shown here) without aggressive interval expansion, indicating a balance between reliability and informativeness.
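The general idea of a coverage-driven iterative adjustment can be sketched as follows. The exact golden-ratio-guided rule is not reproduced here; as an explicit assumption, the sketch moves the adjustment by the coverage bias damped with the inverse golden ratio (about 0.618) and scaled by the mean raw interval width. The calibration data and function names are hypothetical.

```python
import math

GOLDEN_DAMPING = (math.sqrt(5.0) - 1.0) / 2.0    # 1/phi, about 0.618 (assumed step factor)

def coverage(lower, upper, y, delta):
    """Empirical coverage of the intervals widened by delta on both sides."""
    hits = sum(1 for lo, hi, obs in zip(lower, upper, y) if lo - delta <= obs <= hi + delta)
    return hits / len(y)

def calibrate_delta(lower, upper, y, target, tol=1e-9, max_iter=50):
    """Damped iterative update of delta driven by the coverage bias (assumed rule)."""
    scale = sum(hi - lo for lo, hi in zip(lower, upper)) / len(y)   # mean raw width
    delta = 0.0
    for _ in range(max_iter):
        bias = target - coverage(lower, upper, y, delta)
        if abs(bias) <= tol:
            break
        delta += GOLDEN_DAMPING * bias * scale   # widen if under-covered, shrink if over-covered
    return delta, coverage(lower, upper, y, delta)

# Toy calibration set: constant raw interval [2, 4], initially covering 50% of observations.
y_cal = [3.0, 2.5, 3.5, 2.2, 3.8, 4.3, 1.6, 4.9, 1.2, 5.5]
lower_cal = [2.0] * len(y_cal)
upper_cal = [4.0] * len(y_cal)
delta, final_coverage = calibrate_delta(lower_cal, upper_cal, y_cal, target=0.8)
```

On this toy set the coverage rises from 0.5 through 0.6 and 0.7 to the target of 0.8 within a few updates, with each step shrinking as the bias shrinks. Because empirical coverage is a step function of the adjustment, the iteration cap guards against oscillation when the target cannot be met exactly.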
To establish a realistic testing scenario, the experiments are conducted using real operational data from a 40 MW grid-connected PV plant in northern China. Moreover, the adaptive calibration benefits from the enhanced interaction modeling described in Section 3.2 and the weather-aware feature refinement introduced in Section 3.3, which together provide more robust uncertainty structures for interval adjustment. To illustrate this collaborative effect, Table 2 presents ablation results comparing different spatial dependency modeling strategies. The results show that incorporating the adaptive GAT leads to lower CRPS and narrower PINAW while maintaining or improving PICP, indicating that more expressive representation learning contributes to higher-quality interval forecasts with reliable coverage.
Through this coverage-guided refinement mechanism, the MTSA-QRN establishes a closed uncertainty-learning loop that compensates for extreme weather fluctuations and potential distribution mismatch. As a result, the proposed framework shows the potential to offer more trustworthy and well-calibrated probabilistic forecasts for risk-aware operational decision-making in real PV systems.
4. Experimental Results and Discussion
This section presents a comparative evaluation of the proposed MTSA-QRN model against representative probabilistic forecasting baselines. The analysis is based on real supervisory control and data acquisition (SCADA) monitoring measurements collected from a 40 MW grid-connected photovoltaic plant in northern China, recorded at a 15 min temporal resolution, which is consistent with typical operational requirements for ultra-short-term PV forecasting. All models are evaluated under identical forecasting settings, and performance is assessed in terms of probabilistic accuracy, interval reliability, and predictive sharpness.
4.1. Evaluation Metrics
Probabilistic forecasting performance is evaluated using three widely recognized metrics that jointly assess distributional accuracy, reliability, and sharpness. The Continuous Ranked Probability Score (CRPS) is used to measure the overall discrepancy between a forecast cumulative distribution $F$ and the actual observation $y$, expressed as

$$\mathrm{CRPS}(F, y) = \int_{-\infty}^{\infty} \left(F(x) - \mathbb{1}\{x \ge y\}\right)^{2} \, dx$$

A lower CRPS value indicates a forecast distribution that more closely matches the observed outcome.
To assess the reliability of uncertainty quantification, the Prediction Interval Coverage Probability (PICP) is calculated as the proportion of observations falling within the prediction interval $[\hat{L}_i, \hat{U}_i]$:

$$\mathrm{PICP} = \frac{1}{N} \sum_{i=1}^{N} \mathbb{1}\left\{\hat{L}_i \le y_i \le \hat{U}_i\right\}$$

A well-calibrated forecast yields PICP values close to the nominal confidence level (e.g., 90%).
Meanwhile, the Prediction Interval Normalized Average Width (PINAW) quantifies the informativeness of prediction intervals:

$$\mathrm{PINAW} = \frac{1}{N R} \sum_{i=1}^{N} \left(\hat{U}_i - \hat{L}_i\right)$$

where $R = y_{\max} - y_{\min}$ normalizes interval width with respect to the data range.
Smaller PINAW indicates sharper prediction intervals while retaining comparability across datasets and power scales.
Taken together, CRPS rewards accurate probabilistic distributions, PICP measures uncertainty reliability, and PINAW evaluates the usefulness of the produced intervals in operational decision-making.
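The two interval metrics are simple to compute directly from their definitions. The sketch below uses illustrative function names and a four-sample toy dataset rather than the experimental data.

```python
def picp(lower, upper, y):
    """Fraction of observations that fall inside their prediction interval."""
    hits = sum(1 for lo, hi, obs in zip(lower, upper, y) if lo <= obs <= hi)
    return hits / len(y)

def pinaw(lower, upper, y):
    """Mean interval width normalized by the range of the observations."""
    data_range = max(y) - min(y)
    mean_width = sum(hi - lo for lo, hi in zip(lower, upper)) / len(y)
    return mean_width / data_range

# Toy example: the second observation falls below its interval.
y_obs = [1.0, 2.0, 3.0, 4.0]
lower = [0.5, 2.5, 2.0, 3.0]
upper = [1.5, 3.5, 4.0, 5.0]
coverage = picp(lower, upper, y_obs)      # 3 of 4 covered -> 0.75
sharpness = pinaw(lower, upper, y_obs)    # mean width 1.5 over range 3 -> 0.5
```

Reporting both values together matters because either one can be improved trivially at the expense of the other: wider intervals raise PICP and PINAW simultaneously.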
4.2. Multi-Scale Decomposition of Historical PV Power
Historical PV power contains temporal components spanning multiple frequency scales, influenced by rapid cloud-motion fluctuations, diurnal irradiance cycles, and slowly varying atmospheric processes. To disentangle these behaviors, the historical power series is decomposed using variational mode decomposition (VMD), which produces a finite set of intrinsic mode functions (IMFs), each occupying a distinct frequency band.
The number of decomposition modes is determined using an elbow-based criterion, which suggests an appropriate decomposition with seven IMFs. Figure 7 illustrates the original PV power series together with the resulting IMFs.
To identify the components that contribute most effectively to forecasting performance and to suppress redundant or noisy information, the predictive relevance of each IMF is further evaluated using an ensemble of six complementary dependency measures, including LASSO regression, Maximal Information Coefficient (MIC), Random Forest (RF) feature importance, Recursive Feature Elimination (RFE), Ridge regression, and Pearson correlation. Each IMF is assigned a normalized score under each criterion, and the average of these scores is used as a comprehensive importance index.
As shown in Figure 8, IMF6 and IMF7 consistently exhibit relatively low relevance scores (below the composite threshold of 0.2), suggesting that they mainly capture high-frequency components with limited contribution to predictive performance. In contrast, IMF1–IMF5 show substantially higher relevance and are therefore retained as input features for the forecasting model.
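The composite importance index can be sketched as a normalize-then-average procedure. The normalization scheme is not specified in the text, so min-max scaling is assumed here; the raw scores for three components under two criteria are invented purely for illustration, and only the 0.2 threshold is taken from the analysis above.

```python
def min_max(values):
    """Rescale a list of scores to [0, 1]; constant lists map to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def composite_importance(raw_scores):
    """Average the normalized scores of each component across all criteria."""
    normalized = [min_max(column) for column in raw_scores.values()]
    count = len(normalized)
    return [sum(column[i] for column in normalized) / count
            for i in range(len(normalized[0]))]

# Hypothetical raw relevance scores for three components under two criteria.
raw = {"lasso": [0.8, 0.4, 0.0], "pearson": [0.9, 0.5, 0.1]}
index = composite_importance(raw)
selected = [i + 1 for i, score in enumerate(index) if score >= 0.2]   # keeps components 1 and 2
```

Averaging over several criteria reduces the chance that a component is retained or discarded because of the bias of a single dependency measure.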
The reconstruction results in Figure 9 indicate that the selected five IMFs preserve the dominant temporal structure of the original PV power series, achieving a reconstruction coefficient of determination of 0.982, which confirms that the essential dynamics are well retained after feature selection.
4.3. Overall Probabilistic Forecasting Performance and Layer-Wise Ablation Analysis
This subsection first evaluates the overall probabilistic forecasting performance of the proposed MTSA-QRN against representative baseline models, and then quantitatively examines the contribution of key components through a layer-wise ablation analysis covering structural modeling, adaptive coordination, and reliability calibration.
Under the 90% prediction interval, MTSA-QRN achieves the lowest CRPS (0.0400) prior to calibration, substantially outperforming Quantile-GRU (0.0908), Quantile-LSTM (0.0995), and Quantile-Transformer (0.0954). However, its PICP is 0.8140, which is below the nominal 90% confidence level. After applying the adaptive CQR calibration, the PICP of MTSA-QRN increases to 0.9053, while the CRPS rises only slightly to 0.0453 and the PINAW becomes 0.3870, remaining considerably smaller than that of Quantile-SVR (0.5331). These results indicate a favorable trade-off between reliability and interval sharpness (Table 3).
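The two interval metrics quoted above can be computed as in the following sketch. It uses the common definitions: PICP as the fraction of observations falling inside the interval, and PINAW as the mean interval width divided by the range of the observed targets. Whether the paper normalizes PINAW by the observed range or by another constant (such as rated capacity) is an assumption here.

```python
import numpy as np

def picp(y, lower, upper):
    """Prediction interval coverage probability: share of targets inside [lower, upper]."""
    y, lower, upper = (np.asarray(a, dtype=float) for a in (y, lower, upper))
    return float(np.mean((y >= lower) & (y <= upper)))

def pinaw(y, lower, upper):
    """Prediction interval normalized average width.

    Assumption: normalized by the observed target range, max(y) - min(y).
    """
    y, lower, upper = (np.asarray(a, dtype=float) for a in (y, lower, upper))
    return float(np.mean(upper - lower) / (y.max() - y.min()))

if __name__ == "__main__":
    # Toy values for illustration only, not data from the paper.
    y = [0.0, 0.5, 1.0, 0.2]
    lower = [0.0, 0.1, 0.6, 0.3]
    upper = [0.2, 0.4, 1.1, 0.5]
    print(picp(y, lower, upper), pinaw(y, lower, upper))
```

A forecast is considered reliable when PICP is at or above the nominal level, and sharp when PINAW is small; the two must be read together, since arbitrarily wide intervals trivially achieve high coverage.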
Table 4 presents the layer-wise ablation results of MTSA-QRN. In the structural layer, removing the Transformer or TCN leads to a substantial increase in CRPS (to 0.1097 and 0.1127, respectively), underscoring the importance of multi-scale temporal modeling. Removing the GAT module increases CRPS to 0.0554 and noticeably enlarges the prediction interval width (PINAW = 0.5059), suggesting that modeling meteorological dependency contributes to improved distributional accuracy. In the adaptive layer, disabling the weather-adaptive module or the fusion gate results in pronounced performance degradation, with CRPS increasing to 0.1338 and 0.1434, respectively, highlighting the importance of adaptive feature coordination under varying weather conditions. In the calibration layer, removing CQR increases CRPS to 0.1284 and leads to less reliable coverage, indicating that explicit reliability calibration plays a critical role in improving the quality of probabilistic forecasts.
Overall, the results presented in Table 3 and Table 4 indicate that the superior probabilistic performance of MTSA-QRN arises from the coordinated effects of structural spatiotemporal modeling, adaptive scenario-aware integration, and reliability-driven calibration.
4.4. Calibration Performance Analysis
Reliable probabilistic forecasts require prediction intervals that accurately reflect the uncertainty of PV power generation under varying weather conditions. Although MTSA-QRN effectively models the predictive distribution, its raw quantile outputs show miscalibration relative to nominal confidence levels, a behavior commonly observed in deep models trained with the pinball loss.
To enforce the statistical validity of prediction intervals, the proposed framework incorporates an adaptive conformal calibration mechanism. Figure 10 compares the target and achieved coverage rates before and after calibration for confidence levels ranging from 80% to 95%. Prior to calibration, all PICP values fall below their nominal levels, indicating that the raw intervals are too narrow (under-dispersed). After calibration, the achieved coverage closely aligns with the target at every confidence level, confirming the effectiveness of the proposed strategy in improving reliability.
Importantly, interval expansion during the calibration process remains moderate, as evidenced by Table 3, demonstrating a favorable trade-off between interval width and confidence satisfaction. Unlike baseline models that require substantial interval enlargement to achieve comparable coverage, MTSA-QRN benefits from a more coherent representation of distributional uncertainty, thereby preserving competitive predictive sharpness.
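To make the calibration step concrete, the sketch below implements standard split conformalized quantile regression: conformity scores are computed on a held-out calibration set, and the resulting offset widens (or narrows) the raw interval symmetrically. It does not reproduce the adaptive variant used in this work, whose details are not restated in this section; the function names and the synthetic data are illustrative assumptions.

```python
import numpy as np

def cqr_offset(y_cal, lower_cal, upper_cal, alpha):
    """Conformal offset from a calibration set (standard split CQR).

    The conformity score is max(lower - y, y - upper): positive when the
    target falls outside the raw interval, negative when it lies inside.
    The offset is the ceil((n + 1) * (1 - alpha))-th smallest score.
    """
    y_cal, lower_cal, upper_cal = (
        np.asarray(a, dtype=float) for a in (y_cal, lower_cal, upper_cal)
    )
    scores = np.maximum(lower_cal - y_cal, y_cal - upper_cal)
    n = scores.size
    k = int(np.ceil((n + 1) * (1.0 - alpha)))
    if k > n:  # too few calibration points for the requested level
        return float("inf")
    return float(np.sort(scores)[k - 1])

def cqr_calibrate(lower, upper, offset):
    """Shift both interval bounds outward by the conformal offset."""
    return np.asarray(lower, dtype=float) - offset, np.asarray(upper, dtype=float) + offset

if __name__ == "__main__":
    # Synthetic check: standard-normal targets with a deliberately narrow raw interval.
    rng = np.random.default_rng(0)
    y_cal, y_test = rng.normal(size=2000), rng.normal(size=2000)
    raw_lo, raw_hi = np.full(2000, -1.0), np.full(2000, 1.0)
    q = cqr_offset(y_cal, raw_lo, raw_hi, alpha=0.1)
    lo, hi = cqr_calibrate(raw_lo, raw_hi, q)
    print("offset:", q)
    print("coverage before:", np.mean((y_test >= raw_lo) & (y_test <= raw_hi)))
    print("coverage after: ", np.mean((y_test >= lo) & (y_test <= hi)))
```

Because the offset is a single scalar per confidence level, a model whose raw quantiles are already close to calibrated needs only a small correction, which is consistent with the moderate interval expansion reported in Table 3.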
Overall, this analysis highlights the benefit of integrating model-based uncertainty learning with adaptive statistical calibration, ensuring the predictive credibility required for operational decision-making in PV power systems.
5. Conclusions and Future Work
This study proposes MTSA-QRN for uncertainty-aware short-term photovoltaic power forecasting. The developed model addresses three major challenges in PV forecasting, namely the coexistence of multi-scale temporal patterns, nonlinear dependencies among meteorological variables, and weather-driven fluctuations. By integrating multi-scale decomposition, dilated temporal convolution with self-attention refinement, spatial dependency modeling via a graph attention mechanism, weather-adaptive feature transformation, and multi-quantile prediction equipped with adaptive calibration, the framework enables a comprehensive characterization of PV generation dynamics.
The experimental results confirm that the proposed MTSA-QRN generates more reliable and sharper probabilistic forecasts than competitive baseline models. In particular, learning spatial correlations among meteorological variables improves the ability to represent uncertainty under variable weather conditions, while the adaptive calibration strategy effectively aligns empirical coverage rates with target confidence levels. These findings demonstrate that the proposed probabilistic forecasting framework provides dependable uncertainty information for data-driven grid dispatching, reserve scheduling, and operational risk management.
Although the present work demonstrates promising performance and practical relevance, several avenues for further research remain. Future studies may incorporate physical constraints associated with PV conversion mechanisms to enhance interpretability and robust extrapolation, explore spatiotemporal extensions to jointly model multiple geographically dispersed PV plants, and investigate online learning strategies to address long-term distribution shifts in meteorological inputs. In addition, integrating economic evaluation metrics may provide deeper insights into the benefits of uncertainty-aware forecasting in electricity markets.
In summary, MTSA-QRN establishes a unified and data-efficient framework for short-term PV power forecasting with credible uncertainty quantification, supporting reliable renewable energy integration and operational decision-making in modern power systems.