Weather-Dependent Photovoltaic Energy Prediction via Hybrid Deep Learning Models for Sustainable Energy Management

Shu, Quanzhuo; Wang, Qingwang; Cao, Yueqian; Li, Binghao

doi:10.3390/su18126194

¹ School of Transportation and Civil Engineering, Nantong University, Nantong 226019, China

² Faculty of Land Resource Engineering, Kunming University of Science and Technology, Kunming 650500, China

³ State Key Laboratory of Remote Sensing and Digital Earth, Faculty of Geographical Science, Beijing Normal University, Beijing 100875, China

* Author to whom correspondence should be addressed.

† These authors contributed equally to this work.

Sustainability 2026, 18(12), 6194; https://doi.org/10.3390/su18126194

Submission received: 19 April 2026 / Revised: 11 June 2026 / Accepted: 13 June 2026 / Published: 16 June 2026

Abstract

Accurate photovoltaic (PV) power forecasting is pivotal for integrating renewable energy into modern power systems and supporting sustainable energy development. However, existing methods often rely on single deep learning architectures, require complex preprocessing, suffer from training instability, and struggle to capture long-range temporal dependencies. To address these issues, this study develops and compares two hybrid deep learning models, ConvTempNet and DilaTransNet, for hourly PV energy prediction using meteorological and temporal data from two Portuguese PV stations. Systematic evaluations were conducted, including a dropout ablation (a systematic test of dropout rates of 0.2–0.4 to assess model robustness and regularization effects), performance assessment using RMSE, R², MAE, and MAPE, and sensitivity analysis to assess predictive accuracy and variable importance. Results show that the optimized ConvTempNet yields superior hourly accuracy, with an RMSE of 1.16 kWh and an R² of 0.95 at Tartaruga (2.66 kWh and R² = 0.95 at Zarco). The tuned DilaTransNet shows stronger robustness to moderate dropout. Solar radiation is the dominant input variable, while temperature, humidity, and hour affect the two models differently. The two models exhibit complementary strengths, supporting site-specific parameter optimization for reliable PV forecasting.

Keywords:

photovoltaic power forecasting; hybrid deep learning; sustainable energy management

1. Introduction

The rapid expansion of photovoltaic (PV) systems across the globe underscores the pressing need for precise and efficient methods of forecasting PV power output [1,2]. As these systems become integral to the power grid, accurate predictions are essential not only for maintaining grid stability but also for optimizing energy distribution and consumption [3,4]. While significant progress has been made in PV power forecasting, limited historical data and the inherently unpredictable nature of solar irradiance make it difficult to build robust forecasting models [5]. Dynamic fluctuations in solar energy availability further complicate the development of models that can reliably predict PV power output over various time scales [6]. Moreover, existing studies often rely on single deep learning architectures, require complex preprocessing, are prone to training instability, and have insufficient capability to capture long-range temporal dependencies.

Traditional forecasting methods, including statistical models, machine learning algorithms, and physical simulations, have been instrumental in establishing a fundamental understanding of how PV systems generate power, and they have provided valuable insights into the patterns and trends of PV power generation. However, their effectiveness is often curtailed by the need for substantial historical data for model training. This limitation can impede accurate prediction of PV power output, which is crucial for integrating and managing renewable energy within the existing electrical grid infrastructure [7]. Moreover, these methods may struggle to capture the intricate variations in solar power output, which are influenced by many factors, including weather conditions, geographical location, and temporal variations. These challenges highlight the need for more adaptive and sophisticated models that can better accommodate the complexities of solar energy production.

The advent of deep learning (DL) has indeed revolutionized the field of PV power forecasting, introducing a transformative approach that transcends the limitations of conventional methods [8,9]. DL models, with their sophisticated architectures, are uniquely positioned to decipher and model these intricate nonlinear relationships. DL offers significant advantages in PV power forecasting, particularly in its ability to process high-dimensional data and identify complex patterns that traditional methods may overlook. It automatically extracts features and learns representations at multiple abstraction levels, crucial for understanding the intricacies of solar irradiance and its effects on power output [10]. Moreover, DL’s transfer learning capability allows models to leverage knowledge from related tasks, improving accuracy in environments with limited data or frequent changes, thereby enhancing generalization and model adaptability. Furthermore, DL models, especially recurrent ones, excel at capturing temporal dependencies, which is essential for forecasting solar power output due to the diurnal and seasonal variations in solar energy.

Among these DL models, the hybrid Convolutional Neural Network-Long Short-Term Memory (CNN-LSTM) model excels in handling both temporal and spatial characteristics of PV power data [11]. The LSTM component is adept at capturing and remembering information over extended periods, allowing the model to discern patterns and trends that evolve over time and are influenced by long-term weather patterns and seasonal variations [12,13]. The CNN aspect, on the other hand, complements this by identifying spatial hierarchies, such as the arrangement and positioning of solar panels, which affect overall power output. By combining these two architectures, the CNN-LSTM model creates a comprehensive and nuanced understanding of PV power generation patterns [14,15]. In addition, the Temporal Convolution Network (TCN) adds a higher level of sophistication to forecasting by integrating a transformer component specifically designed to handle sequences with long-range dependencies. This ability is crucial for accurately predicting the volatile and intermittent behavior of solar irradiance. By capturing complex interdependencies across various time scales, the TCN significantly outperforms traditional models in solar power generation forecasting [16,17].

Recent studies have increasingly explored hybrid deep learning architectures to improve renewable energy forecasting by combining complementary modeling components. For instance, the Time2Vec–BiTCN–BiGRU framework integrates temporal convolution and recurrent structures to capture periodic patterns in photovoltaic and wind power generation [18]. Similarly, the xLSTM–TCCNN model incorporates numerical weather prediction data with enhanced recurrent and convolutional networks to better address the intermittency of renewable energy generation [19]. More recently, the multi-scale convolutional Kolmogorov–Arnold network (MCKAN) has been proposed to enhance multi-step wind and solar power forecasting by combining multi-scale feature extraction with attention mechanisms [20]. These hybrid approaches demonstrate the effectiveness of integrating different deep learning paradigms for modeling the nonlinear and time-dependent characteristics of renewable energy generation.

Despite significant progress in PV power forecasting via deep learning methods, several challenges remain. Many existing studies focus primarily on improving prediction accuracy through increasingly complex hybrid architectures, while relatively less attention has been paid to understanding how meteorological variables influence PV generation and model performance across different stations. Furthermore, the interpretability of meteorology-driven PV forecasting models remains limited in many deep learning frameworks.

In this paper, a dropout-regularized CNN-LSTM with batch normalization (hereafter referred to as ConvTempNet) and a TCN integrating dilated convolutions and a Transformer (hereafter referred to as DilaTransNet) are proposed to address the complexities of photovoltaic power forecasting. ConvTempNet is employed to extract temporal patterns influenced by weather conditions, leveraging its strength in combining spatial and sequential dependencies. DilaTransNet, on the other hand, provides a robust framework for capturing long-range temporal relationships across multiple time scales, which are essential for accurately forecasting weather-dependent photovoltaic output. This dual approach seeks to outperform traditional forecasting models and enhance the predictability of solar energy output, supporting greater grid stability and optimized energy management. Unlike previous studies that focus on a single architecture (Table 1), this study performs a comparative analysis of two distinct deep learning paradigms (Convolutional–Recurrent vs. Transformer-based) to identify their specific suitability for different weather regimes. Compared with the existing hybrid structures summarized in Table 1, ConvTempNet uses a streamlined CNN–LSTM with batch normalization and adaptive dropout, avoiding overcomplicated attention modules and extra signal decomposition preprocessing such as VMD; this reduces computation cost while stabilizing training. DilaTransNet combines dilated convolutions with a lightweight Transformer to capture both local and long-range temporal dependencies without additional data processing or the multiple metaheuristic optimization modules used in recent hybrid frameworks. These designs yield higher accuracy, better robustness, and lower computational complexity compared with prior structures.

Combined with targeted meteorology-oriented feature sensitivity experiments, the two proposed designs balance prediction accuracy, training robustness, and computational efficiency, enabling better adaptability to weather-induced fluctuations of PV output than existing alternatives. Additionally, the study incorporates correlation analysis and feature exclusion experiments to quantify the contributions of meteorological and temporal variables to model performance. By integrating multi-station observations with interpretable feature analysis, the proposed framework not only evaluates forecasting accuracy but also provides insights into the meteorological factors governing photovoltaic energy production. Reliable forecasts enable more efficient integration of intermittent renewable energy resources into power grids, reduce operational uncertainties, and support the development of low-carbon and resilient energy systems. Therefore, the present study contributes to sustainable energy management and renewable energy deployment.

2. Dataset

Ilias et al. [26] produced a dataset of two PV systems (Tartaruga and Zarco) located in Portuguese cities, comprising hourly measurements of PV output and corresponding weather records (Table 2). The PV output originated from solar installations within a Portuguese energy community, while the weather data (Figure 1) were sourced from both a local meteorological station (https://www.wunderground.com/, accessed on 10 June 2026) and the Copernicus Atmosphere Data Store (https://ads.atmosphere.copernicus.eu/, accessed on 10 June 2026). The input features selected for the forecasting models include generated energy (Produzida), year, month, day, timestamp, solar radiation, temperature, relative humidity, a one-hot encoded representation of the month of the year (enabling the model to distinguish seasonal changes and generalize across different time periods), and sine/cosine transformations of the hour of the day (allowing the model to capture periodic trends in PV power generation effectively).
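
The two temporal encodings described above can be sketched as follows. This is a minimal illustration rather than the authors' preprocessing code; the function names `encode_hour`, `one_hot_month`, and `hour_distance` are hypothetical. It shows why the sine/cosine pair is useful: hour 23 and hour 0 end up as close to each other as hour 0 and hour 1, which a raw integer hour would not capture.

```python
import math

def encode_hour(hour):
    """Map an hour in [0, 24) to a (sin, cos) pair on the unit circle."""
    angle = 2.0 * math.pi * hour / 24.0
    return math.sin(angle), math.cos(angle)

def one_hot_month(month):
    """Return a 12-element one-hot vector for a month in 1..12."""
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    return [1.0 if m == month else 0.0 for m in range(1, 13)]

def hour_distance(h1, h2):
    """Euclidean distance between two encoded hours."""
    s1, c1 = encode_hour(h1)
    s2, c2 = encode_hour(h2)
    return math.hypot(s1 - s2, c1 - c2)
```

With this encoding, `hour_distance(23, 0)` equals `hour_distance(0, 1)`, so the day boundary introduces no artificial jump in the input.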

Following the standard protocol for time-series forecasting, the dataset was strictly partitioned into three disjoint subsets with ratios of 70%, 15%, and 15%, respectively: a training set (for model parameter optimization), a validation set (for hyperparameter tuning and overfitting monitoring during training), and an independent test set (completely held out from training and validation, used exclusively for final performance evaluation).
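
A 70/15/15 partition of this kind might look like the sketch below. It assumes the split is chronological (earliest records for training, latest for testing), which is the usual choice for time series but is not stated explicitly in the text; the function name `chronological_split` is hypothetical.

```python
def chronological_split(records, train_pct=70, val_pct=15):
    """Split time-ordered records into disjoint train/validation/test parts.

    Percentages are integers so the boundaries are computed exactly;
    the test set receives whatever remains after train and validation.
    """
    n = len(records)
    n_train = n * train_pct // 100
    n_val = n * val_pct // 100
    train = records[:n_train]
    val = records[n_train:n_train + n_val]
    test = records[n_train + n_val:]
    return train, val, test
```

Because the slices never overlap and preserve order, no future observation can leak into the training set.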

3. Methods

To clarify the motivation and workflow of the proposed approach, Figure 2 illustrates the overall framework of the meteorology-driven photovoltaic energy forecasting system developed in this study. The framework integrates meteorological observations and historical PV output to construct predictive models using CNN-LSTM and TCN architectures. The design aims to capture both nonlinear meteorological influences and temporal dependencies in PV generation while maintaining model interpretability through correlation analysis and feature contribution experiments. ConvTempNet tests the efficacy of spatio-temporal feature extraction, while DilaTransNet tests the efficacy of attention mechanisms for long-term dependencies.

Compared to traditional forecasting techniques, ConvTempNet and DilaTransNet offer several key advantages. First, their ability to generalize from limited data makes them particularly well-suited for regions or systems with insufficient historical data [27]. Second, their strong representation learning capability enables the models to capture complex nonlinear relationships from large datasets during training, thereby improving their adaptability and prediction accuracy under various meteorological conditions. Lastly, the hybrid nature of these models enables a more nuanced understanding of the multifaceted factors influencing PV power output, leading to more precise and reliable forecasts [28,29].

The hyperparameters used in this study are summarized in Table 3. These parameters were determined through a combination of empirical settings and preliminary experiments to ensure stable training and reliable forecasting performance. A total of 12 typical candidate configurations were tested for key hyperparameters, including the learning rate, batch size, number of convolutional filters, number of hidden units, and dropout rate. The final values were selected based on the lowest validation loss and the fastest convergence. Time-series cross-validation was adopted to avoid data leakage and ensure stable and generalizable parameter selection. Extensive sensitivity tests were conducted for sequence lengths ranging from 5 to 20. A length of 10 was selected because it achieved the best balance between prediction accuracy, model convergence, and computational cost. Longer sequences introduced redundant temporal information and increased training instability, while shorter sequences failed to capture sufficient hourly weather–PV generation dependencies.
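
To make the role of the sequence length concrete, the sketch below builds input windows of 10 consecutive hourly rows. It assumes a one-step-ahead target (the hour immediately after each window), which the text does not specify; `make_windows` is a hypothetical helper name.

```python
def make_windows(features, targets, seq_len=10):
    """Pair each window of seq_len consecutive rows with the target that follows it."""
    if len(features) != len(targets):
        raise ValueError("features and targets must have the same length")
    windows, labels = [], []
    for start in range(len(features) - seq_len):
        windows.append(features[start:start + seq_len])
        labels.append(targets[start + seq_len])  # assumed one-step-ahead target
    return windows, labels
```

A series of N hourly records yields N - seq_len samples, so a longer window gives the model more context per sample at the cost of fewer samples and more computation.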

The experiments were conducted using Python 3.10 with the PyTorch 2.3.1 deep learning framework. Model training was performed on a workstation equipped with an NVIDIA RTX 4090 GPU (24 GB VRAM) and an AMD 16-core CPU, which enabled efficient handling of the deep learning computations. ConvTempNet has fewer network parameters and requires around 1.2 min for full training; DilaTransNet, with stacked Transformer blocks, takes roughly 2.1 min per training round. Benefiting from lightweight hybrid designs, both models are readily scalable to longer time sequences and multi-site datasets without excessive computational burden.

Outliers were removed by the 3σ rule, and missing data were linearly interpolated. All inputs were normalized to [0, 1] via min-max scaling. Hyperparameters were preliminarily screened by grid search and further optimized by the Artificial Lemming Algorithm and the Crested Porcupine Optimizer. To prevent overfitting and stabilize training convergence, layered dropout and an early stopping strategy were employed: training was terminated when the validation loss did not improve for a predefined number of epochs, ensuring that the model maintained good generalization performance.
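
The preprocessing and early-stopping steps can be sketched in plain Python as below. This is an illustrative reading of the description, not the authors' pipeline: the helper names are hypothetical, masked outliers are refilled by the same linear interpolation used for missing data (an assumption), and the patience value is a placeholder because the text only says "a predefined number of epochs".

```python
import statistics

def mask_outliers(values, k=3.0):
    """Replace points more than k standard deviations from the mean with None."""
    present = [v for v in values if v is not None]
    mean = statistics.fmean(present)
    std = statistics.pstdev(present)
    return [None if v is not None and abs(v - mean) > k * std else v
            for v in values]

def interpolate_linear(values):
    """Fill interior None gaps linearly between the nearest known neighbours."""
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled  # leading or trailing gaps stay None in this sketch

def min_max_scale(values, lo, hi):
    """Scale values to [0, 1] using bounds taken from the training data."""
    return [(v - lo) / (hi - lo) for v in values]

def early_stop_epoch(val_losses, patience=5):
    """Return the epoch at which training would stop, or None if it never does."""
    best, wait = float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, wait = loss, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None
```

Taking `lo` and `hi` from the training subset only, and reusing them for validation and test data, keeps the scaling free of information from held-out records.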

3.1. ConvTempNet

The convolutional operation $\mathrm{Conv}$ in the CNN component (Figure 3) extracts spatial features from the input $x$:

$$\mathrm{Conv}(x) = (x * \omega) + b \tag{1}$$

where $\omega$ is the convolution kernel (or filter), and $b$ is the bias term. The convolution operator $*$ extracts local patterns by sliding the kernel over the input. The output of the convolution operation is passed through the ReLU activation function to introduce non-linearity:

$$\mathrm{Conv1D}(x) = \mathrm{ReLU}(\mathrm{Conv}(x) + b) \tag{2}$$
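
As a small worked example of a convolution followed by ReLU, the sketch below slides a single kernel over a one-channel sequence. It assumes "valid" boundaries (no padding) and applies the bias once; like most deep learning frameworks, it computes a cross-correlation, meaning the kernel is not flipped. The name `conv1d_relu` is hypothetical.

```python
def conv1d_relu(x, kernel, bias=0.0):
    """Slide the kernel over x (no padding), add the bias, then apply ReLU."""
    k = len(kernel)
    out = []
    for start in range(len(x) - k + 1):
        s = sum(x[start + j] * kernel[j] for j in range(k)) + bias
        out.append(max(0.0, s))  # ReLU clips negative responses to zero
    return out
```

For instance, with the difference kernel `[1.0, -1.0]`, the input `[1.0, 2.0, 4.0, 3.0]` gives `[0.0, 0.0, 1.0]`: only the final step, where the signal falls, produces a positive response.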

The LSTM component (Figure 3) is responsible for capturing temporal dependencies within the sequential data. It consists of the following gates controlling the flow of information.

The candidate cell state $\tilde{c}_{t}$ is computed via the activation function $\tanh$, weight matrix $W_{c}$, previous hidden state $h_{t-1}$, current input $x_{t}$, and bias term $b_{c}$:

$$\tilde{c}_{t} = \tanh(W_{c} \cdot [h_{t-1}, x_{t}] + b_{c}) \tag{3}$$

Then, the cell state $c_{t}$ is updated via the forget gate $f_{t}$ and the input gate $i_{t}$:

$$c_{t} = f_{t} \cdot c_{t-1} + i_{t} \cdot \tilde{c}_{t} \tag{4}$$

Subsequently, the output gate calculates the final hidden state $h_{t}$ through the sigmoid function $\sigma$ with weight matrix $W_{o}$ and bias term $b_{o}$:

$$h_{t} = \sigma(W_{o} \cdot [h_{t-1}, x_{t}] + b_{o}) \cdot \tanh(c_{t}) \tag{5}$$

After the LSTM component processes the sequential data, the output is passed to two fully connected layers (Figure 3) to produce the final predicted photovoltaic power $y$:

$$y = W \cdot h_{t} + b \tag{6}$$
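
The LSTM update in Eqs. (3)–(5) can be traced with a scalar sketch like the one below. The forget and input gates are not written out in the text, so the standard sigmoid forms are assumed here; the `params` layout and the function names are hypothetical, and a real layer would use weight matrices rather than scalars.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step for a scalar input and a scalar hidden state.

    params maps each gate name ('f', 'i', 'c', 'o') to a tuple (w_h, w_x, b).
    """
    def affine(name):
        w_h, w_x, b = params[name]
        return w_h * h_prev + w_x * x_t + b

    f_t = sigmoid(affine("f"))           # forget gate (standard form, assumed)
    i_t = sigmoid(affine("i"))           # input gate (standard form, assumed)
    c_tilde = math.tanh(affine("c"))     # candidate cell state, Eq. (3)
    c_t = f_t * c_prev + i_t * c_tilde   # cell state update, Eq. (4)
    o_t = sigmoid(affine("o"))           # output gate activation
    h_t = o_t * math.tanh(c_t)           # hidden state, Eq. (5)
    return h_t, c_t
```

With all weights and biases set to zero, every gate evaluates to 0.5 and the candidate state to 0, so the cell simply halves its previous memory; this makes the gating role of $f_{t}$ easy to verify by hand.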

3.2. DilaTransNet

As a key component of DilaTransNet, the Temporal Block (Figure 4) is designed to extract local features from time-series data by applying two consecutive 1D convolutional layers ($\mathrm{Conv}_{1}$ and $\mathrm{Conv}_{2}$) to the input feature map $x$:

$$\mathrm{output} = \mathrm{ReLU}(\mathrm{Conv}_{2}(\mathrm{Dropout}(\mathrm{ReLU}(\mathrm{Conv}_{1}(x) + b_{1}))) + b_{2}) + \mathrm{residual} \tag{7}$$

in which the dropout mechanism is incorporated to enhance generalization by randomly deactivating neurons, thereby mitigating overfitting. Each convolutional layer is followed by a ReLU activation function, introducing non-linearity to capture complex relationships in the data. The output from the block is then combined with a residual connection, where the original input is added to the convolutional output, preserving the input features while facilitating gradient flow during backpropagation. Bias terms, $b_{1}$ and $b_{2}$, are included in the convolutional operations to further refine the output. Stacking multiple Temporal Block layers in DilaTransNet progressively extracts higher-level features, enabling robust modeling of temporal dependencies in the data.
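
A single-channel sketch of the Temporal Block in Eq. (7) is given below. It evaluates the block at inference time, where dropout acts as the identity, and it assumes causal left zero-padding so that the output has the same length as the input; the text does not state the padding scheme, and all function names are hypothetical.

```python
def dilated_causal_conv(x, kernel, dilation, bias=0.0):
    """Causal 1D convolution: kernel[j] multiplies x[t - j * dilation]."""
    out = []
    for t in range(len(x)):
        s = bias
        for j, w in enumerate(kernel):
            idx = t - j * dilation
            if idx >= 0:  # positions before the start count as zero padding
                s += w * x[idx]
        out.append(s)
    return out

def relu(values):
    return [max(0.0, v) for v in values]

def temporal_block(x, k1, b1, k2, b2, dilation):
    """Eq. (7) at inference time, where dropout is the identity."""
    hidden = relu(dilated_causal_conv(x, k1, dilation, b1))
    out = relu(dilated_causal_conv(hidden, k2, dilation, b2))
    return [o + r for o, r in zip(out, x)]  # residual connection
```

With a dilation of 2, the second kernel tap reads the value two steps back, so each output can draw on a wider span of past hours than an undilated kernel of the same size.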

The Transformer Layer complements the Temporal Block by focusing on long-range dependencies within the data. At its core, the self-attention mechanism (Figure 4) processes three matrices, $Q$ (Query), $K$ (Key), and $V$ (Value), which are derived from the input sequence through linear transformations:

$$\mathrm{Attention}(Q, K, V) = \mathrm{Softmax}\left(\frac{Q \cdot K^{T}}{\sqrt{d_{k}}}\right) V \tag{8}$$

where $d_{k}$ is the dimension of the key vectors and is used for scaling. This process allows the model to weigh the importance of different time steps in the input sequence, capturing intricate temporal patterns.
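
Eq. (8) can be checked by hand with a small pure-Python sketch such as the following, which treats $Q$, $K$, and $V$ as lists of row vectors. It covers a single attention head without masking or learned projections, and the function names are hypothetical.

```python
import math

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention; returns (outputs, attention weights)."""
    d_k = len(K[0])
    outputs, weights = [], []
    for q in Q:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in K]
        w = softmax(scores)
        weights.append(w)
        outputs.append([sum(w_j * v[c] for w_j, v in zip(w, V))
                        for c in range(len(V[0]))])
    return outputs, weights
```

Each row of weights sums to 1, so every output is a weighted average of the value vectors, with the weights expressing how strongly one time step attends to each of the others.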

The feed-forward network (FFN, Figure 4) refines the features further by applying the $\mathrm{ReLU}$ activation to the input:

$$\mathrm{FFN}(x) = \mathrm{ReLU}(W_{1} \cdot x + b_{1}) \cdot W_{2} + b_{2} \tag{9}$$

where $W_{1}$ and $W_{2}$ are weight matrices, and $b_{1}$ and $b_{2}$ are biases. The output of the Transformer Layer is fed into subsequent layers, forming a stack in the Transformer Encoder. Each layer builds upon the refined representation from the previous layer, enhancing the model’s ability to capture hierarchical relationships in time-series data. Overall, the Temporal Block and Transformer Layer modules enable DilaTransNet to model both local and long-range dependencies effectively.

In the context of this hourly short-term PV forecasting task, long-range temporal dependencies denote the long-span sequential correlations and periodic patterns across multiple consecutive hours of the input time series, rather than long-horizon predictions over days or weeks. DilaTransNet is constructed to capture such patterns via dilated convolutions and a Transformer. The term thus describes the model’s ability to capture persistent and periodic hourly-scale trends within the input window, not the length of the prediction horizon.

3.3. Evaluation Metrics

The models’ performance was evaluated via four metrics: root mean squared error (RMSE), mean absolute error (MAE), determination coefficient (R²), and mean absolute percentage error (MAPE):

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_{i} - \hat{y}_{i})^{2}} \tag{10}$$

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_{i} - \hat{y}_{i}| \tag{11}$$

$$R^{2} = 1 - \frac{\sum_{i=1}^{n} (y_{i} - \hat{y}_{i})^{2}}{\sum_{i=1}^{n} (y_{i} - \bar{y})^{2}} \tag{12}$$

$$\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left|\frac{y_{i} - \hat{y}_{i}}{y_{i}}\right| \times 100\% \tag{13}$$

where $n$ is the number of records, $y_{i}$ represents the actual value of the $i$th measurement, $\hat{y}_{i}$ marks the predicted value of the $i$th measurement, and $\bar{y}$ denotes the mean of the actual values.

RMSE is the square root of the mean squared difference between predicted and observed values, emphasizing larger errors due to the squaring operation. MAE measures the average absolute differences, offering a straightforward interpretation of model error. The R² metric assesses the proportion of variance in the observed data that is captured by the model, providing an indication of its explanatory power, with values closer to 1 signifying better performance. MAPE evaluates the relative prediction accuracy by calculating the percentage deviation between observed and predicted values, offering a scale-independent measure that facilitates comparison across datasets. Together, these metrics provide a comprehensive assessment of the models’ accuracy and reliability.
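
The four metrics in Eqs. (10)–(13) translate directly into a few lines of Python. The sketch below is a plain transcription of those formulas and uses the hypothetical function name `evaluate`; note that Eq. (13) divides by $y_{i}$ and is therefore undefined for records where the observed value is exactly zero, so this version expects nonzero observations.

```python
import math

def evaluate(y_true, y_pred):
    """Return RMSE, MAE, R2, and MAPE (in percent) for paired observations."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "rmse": math.sqrt(ss_res / n),                                   # Eq. (10)
        "mae": sum(abs(e) for e in errors) / n,                          # Eq. (11)
        "r2": 1.0 - ss_res / ss_tot,                                     # Eq. (12)
        "mape": 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n,  # Eq. (13)
    }
```

For observations `[2.0, 4.0, 6.0]` and predictions `[1.0, 4.0, 8.0]`, the squared errors sum to 5 and the total sum of squares is 8, giving an R² of 0.375 and an MAE of 1.0.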

In addition to conventional evaluation metrics, the statistical significance of the correlation between predicted and observed photovoltaic energy outputs was assessed through the Pearson correlation test. The associated p-values were computed using a two-sided t-test.
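
The quantities behind such a test can be sketched as follows: the Pearson coefficient $r$ and the statistic $t = r\sqrt{(n-2)/(1-r^{2})}$, which is compared against a Student t distribution with $n-2$ degrees of freedom. The function names are hypothetical, and the p-value lookup is left out because it needs the t distribution's CDF (for example, SciPy's `scipy.stats.pearsonr` returns both the coefficient and a p-value).

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """Test statistic for H0: no correlation, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))
```

For a fixed correlation, the statistic grows with the number of records, so even moderate correlations become statistically significant on large hourly datasets.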

4. Results and Discussion

4.1. Ablation Test

Figure 5 illustrates the impact of varying dropout rates on the evaluation metrics of ConvTempNet and DilaTransNet at the two stations. Dropout, a widely recognized regularization technique, is applied to reduce overfitting by randomly deactivating neurons during training. The results reveal that while moderate dropout rates enhance model generalization, excessive dropout leads to significant performance degradation, impairing the models’ predictive capability. A dropout rate of 1.0 is included solely for boundary illustration and does not represent any applicable or recommended parameter for practical photovoltaic forecasting. It helps visualize the performance boundary when all neurons are disabled, clearly demonstrating the transition from effective regularization to severe underfitting. This extreme value provides a full reference for analyzing how dropout intensity affects model behavior.
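
The mechanism being ablated can be sketched as below. The code assumes "inverted" dropout, in which surviving activations are rescaled by 1/(1 - p) during training, as is common in deep learning frameworks; the text does not say which variant was used, and the function name is hypothetical.

```python
import random

def dropout(values, p, rng, training=True):
    """Zero each activation with probability p and rescale the survivors."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if not training or p == 0.0:
        return list(values)          # dropout is the identity at inference time
    if p == 1.0:
        return [0.0 for _ in values]  # boundary case: every activation is disabled
    scale = 1.0 / (1.0 - p)
    return [0.0 if rng.random() < p else v * scale for v in values]
```

At p = 1.0 the layer outputs only zeros, so nothing from the input reaches later layers; this is the boundary case described above.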

For the ConvTempNet, moderate dropout rates of 0.2–0.4 yield relatively stable performance, with lower RMSE (MAE) of approximately 1.5 (0.7) kWh at Tartaruga and 2.6 (1.3) kWh at Zarco, while R² remains high above 0.9 and MAPE stays below 40%. However, as dropout rates increase beyond 0.6, RMSE (MAE) rises sharply, reaching approximately 6 (5) kWh at Tartaruga and 13 (10) kWh at Zarco for a dropout rate of 1.0. This is accompanied by a significant decline in R², dropping to −0.4, and MAPE increasing dramatically to about 300%, reflecting severe underfitting and a loss of predictive capability. Such trends align with findings that excessive rates hinder a model’s ability to capture complex temporal patterns [30]. It should be noted that these values correspond to the intermediate sensitivity experiments. The final model configuration, obtained after hyperparameter tuning with an optimal dropout rate of 0.1 (Table 3), achieves a lower RMSE of 1.16 kWh at Tartaruga.

The DilaTransNet exhibits slightly better resilience to higher dropout rates, as RMSE (MAE) remains low, around 1.5 (0.5) kWh at Tartaruga and 2.5 (1.2) kWh at Zarco, when the dropout rate is 0.2. R² remains above 0.9, and MAPE stays below 50%. However, as dropout increases, RMSE rises and R² drops significantly, reaching their maximum and minimum, respectively, at both stations, reflecting the adverse effects of over-regularization. These observations show that the DilaTransNet architecture benefits from moderate dropout rates to maintain predictive accuracy, while excessive rates reduce the model’s ability to capture long-term dependencies critical for accurate forecasting [22,31].

Notably, the differences between Tartaruga and Zarco highlight the influence of site-specific factors, such as weather conditions and PV system configurations, on model performance. For instance, while Tartaruga maintains lower RMSE under moderate dropout, Zarco exhibits higher baseline errors, emphasizing the importance of tailoring dropout rates and model parameters to specific datasets for optimal forecasting accuracy [32].

4.2. Model Performance

Figure 6 illustrates the evolution of training and validation loss over 50 epochs for ConvTempNet and DilaTransNet at the two stations. The figure focuses on convergence behavior during training; all final performance evaluations are conducted on a strictly held-out, independent test set that was not used for training or hyperparameter tuning. Both models show a steady decline in training loss during the initial epochs, reflecting effective learning and parameter optimization. By approximately the 10th epoch, the training loss flattens, indicating minimal further improvement with additional training. The validation loss, which assesses generalization to unseen data, stabilizes within the first 10 epochs and remains consistently lower than the training loss throughout most of the training process, indicating efficient convergence and robust generalization with no sign of overfitting. At Zarco, the validation loss is slightly lower than at Tartaruga, potentially reflecting better prediction accuracy or less complex data variability; this suggests that station-specific factors, such as environmental or data characteristics, may influence the models’ ability to generalize across stations. Slight fluctuations in validation loss across epochs are observed in both models, possibly because of sensitivity to specific data features or minor adjustments during training. Despite these variations, the models achieve low final loss values, denoting reliable learning and strong predictive performance.

The scatter plots (Figure 7) display the relationship between predicted and true PV energy outputs for ConvTempNet and DilaTransNet at the two stations. For ConvTempNet, the results at Tartaruga demonstrate strong predictive accuracy, with most points tightly clustered around the diagonal line. The RMSE is low at 1.19 kWh, and the R² is 0.95, indicating that the model effectively captures the variance in PV energy outputs. At Zarco, ConvTempNet produces slightly higher errors, with an RMSE of 2.59 kWh. Despite these deviations, the overall performance remains reliable, as reflected by the high R² of 0.96. These patterns indicate that ConvTempNet generalizes well across different stations [21,23,33], although minor inaccuracies at Zarco may be attributed to greater variability in the data. DilaTransNet demonstrates slightly weaker performance than ConvTempNet. At Tartaruga and Zarco, the RMSE increases to 1.24 and 2.74 kWh, respectively, and the spread of points becomes more pronounced, highlighting greater prediction errors. The R² values, although still relatively high, fall short of those achieved by ConvTempNet, underscoring DilaTransNet’s reduced ability to capture finer details in the data, particularly at Zarco. Quantitative comparisons show that ConvTempNet achieves the fastest inference speed (2.3 ms per sample) and the strongest stability under cloudy and fluctuating radiation conditions. Under variable meteorological conditions, the RMSE fluctuation range is controlled within 0.15 kWh, indicating high robustness. For practical grid applications, this efficiency and stability support real-time prediction and reliable power dispatching. Furthermore, the Pearson correlation tests indicate that the correlations between predicted and observed PV outputs are statistically significant for both stations (p < 0.01), confirming the robustness of the forecasting results.

Statistical comparison using pairwise t-tests shows that the accuracy differences between ConvTempNet and DilaTransNet are statistically significant at the p < 0.05 level across both stations. Although the numerical gaps in RMSE, MAE, and R² appear small, the consistent superiority of ConvTempNet is statistically confirmed. For real-world PV forecasting, this statistically significant advantage supports more stable power scheduling and more reliable grid operation.

The relatively high full-dataset MAPE ranging from 43% to 51% is a typical statistical artifact of MAPE when facing massive zero nighttime PV output, and all zero/near-zero samples are kept intact for error calculation in this study without filtering. Minimal absolute prediction errors for zero nighttime power trigger sharp growth in percentage error and raise the overall MAPE for all competing models. For objective performance evaluation unaffected by zero-value defects of MAPE, RMSE, MAE and coefficient of determination R² serve as core complementary metrics, which are not mathematically sensitive to near-zero true values. Superior performance on these robust indicators confirms the proposed models’ reliable prediction precision for daytime effective power, ensuring practical usability for real photovoltaic operational scheduling despite the inflated aggregate MAPE.
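
The inflation effect can be reproduced with a toy calculation. The numbers below are invented for illustration only and do not come from the Tartaruga or Zarco data; near-zero rather than exactly zero night values are used because Eq. (13) is undefined at zero, and the 1.0 kWh daytime threshold is a hypothetical choice.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error in percent; requires nonzero y_true."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean absolute error in the units of the data."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy values in kWh: two near-zero night hours followed by two daytime hours.
y_true = [0.01, 0.02, 5.0, 10.0]
y_pred = [0.05, 0.00, 5.5, 9.0]

full_mape = mape(y_true, y_pred)   # dominated by the two night hours
full_mae = mae(y_true, y_pred)     # barely affected by them

# Keep only hours above a hypothetical daytime threshold of 1.0 kWh.
day = [(t, p) for t, p in zip(y_true, y_pred) if t >= 1.0]
day_mape = mape([t for t, _ in day], [p for _, p in day])
```

In this toy case the night-hour errors are only a few hundredths of a kWh, yet they lift the overall MAPE to about 130%, while the daytime-only MAPE is 10% and the MAE stays at 0.39 kWh.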

To validate the proposed architectures, comparative experiments with three widely used baseline models (CNN-BiLSTM, GRU, and MLP) were conducted at the two PV stations (Figure 8). Relative to conventional CNN, GRU, and MLP models, CNN-BiLSTM combines spatial feature extraction with bidirectional temporal dependency learning and therefore represents a stronger benchmark for photovoltaic forecasting. All baseline models exhibit significantly worse predictive performance than the proposed models. For Tartaruga, the baseline models have higher RMSE (1.18–2.00 kWh vs. ConvTempNet < 1.2 kWh), higher MAE, and lower prediction accuracy. For Zarco, the baseline models show even more severe performance degradation, with higher errors (MAE > 1.31 kWh vs. ConvTempNet < 1.2 kWh) and more scattered points deviating from the ideal diagonal, especially at high photovoltaic output. These results indicate that the proposed task-specific architectural reconfiguration, lightweight optimization, and complementary dual-model design deliver substantial performance gains over conventional unmodified single and hybrid network structures, demonstrating the effectiveness of the proposed ConvTempNet and DilaTransNet.

4.3. Effects of Hyperparameter Optimization

To further improve model performance, two meta-heuristic optimization algorithms—the Artificial Lemming Algorithm (ALA) and the Crested Porcupine Optimizer (CPO)—were introduced to optimize the key hyperparameters of ConvTempNet and DilaTransNet. Based on these strategies, four optimized models were constructed: ALA-ConvTempNet, ALA-DilaTransNet, CPO-ConvTempNet, and CPO-DilaTransNet. Additional experiments were conducted to evaluate the effectiveness of the optimization approaches at the Tartaruga and Zarco photovoltaic stations (Figure 9 and Figure 10).

For the Tartaruga station, ALA optimization improves the forecasting performance of both models. The RMSE, MAE, and MAPE of ConvTempNet decrease from 1.19 to 1.16 kWh, from 0.66 to 0.60 kWh, and from 43.10% to 33.18%, respectively. Meanwhile, the RMSE and MAE of DilaTransNet decrease from 1.24 to 1.19 kWh and from 0.63 to 0.56 kWh, respectively, while the coefficient of determination (R²) increases from 0.94 to 0.95. For the Zarco station, the ALA-optimized models maintain high prediction accuracy, with only minor changes in the error metrics. In particular, ConvTempNet preserves a high coefficient of determination (R² ≈ 0.95), demonstrating stable predictive capability across stations.

CPO optimization shows a similar trend. At Tartaruga, CPO improves the performance of DilaTransNet, with RMSE decreasing from 1.24 to 1.18 kWh and MAE decreasing from 0.63 to 0.60 kWh, while R² increases from 0.94 to 0.95. ConvTempNet exhibits only minor variations after CPO optimization, indicating that the baseline configuration already provides stable predictive performance. For Zarco, both models maintain strong predictive capability after CPO optimization, with R² remaining close to 0.95. Although slight fluctuations occur in RMSE and MAE, the optimized models demonstrate robust forecasting performance across different meteorological conditions.

Although the RMSE reductions from ALA and CPO optimization are modest (approximately 0.03–0.06 kWh at Tartaruga), these improvements are statistically significant at the p < 0.05 level based on pairwise t-tests. In real-world PV forecasting, such small but consistent error reductions can improve power scheduling stability, reduce reserve capacity costs, and enhance the reliability of grid-connected operation, making them practically meaningful for engineering applications.
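
For reference, the absolute and relative reductions can be recomputed directly from the Tartaruga values quoted in this section. The short script below uses only those reported before/after figures; the tuple layout and the `reduction` helper are illustrative choices.

```python
# (optimizer, model, metric, before, after) as reported for Tartaruga, in kWh.
reported = [
    ("ALA", "ConvTempNet", "RMSE", 1.19, 1.16),
    ("ALA", "ConvTempNet", "MAE", 0.66, 0.60),
    ("ALA", "DilaTransNet", "RMSE", 1.24, 1.19),
    ("ALA", "DilaTransNet", "MAE", 0.63, 0.56),
    ("CPO", "DilaTransNet", "RMSE", 1.24, 1.18),
    ("CPO", "DilaTransNet", "MAE", 0.63, 0.60),
]


def reduction(before, after):
    """Return (absolute reduction in kWh, relative reduction in percent)."""
    absolute = before - after
    return round(absolute, 2), round(100.0 * absolute / before, 1)


for optimizer, model, metric, before, after in reported:
    absolute, relative = reduction(before, after)
    print(f"{optimizer}-{model} {metric}: -{absolute:.2f} kWh ({relative}%)")

# RMSE reductions only, in the order listed above.
rmse_drops = [reduction(b, a)[0] for _, _, metric, b, a in reported if metric == "RMSE"]
```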

4.4. Input Correlation and Sensitivity

Figure 11 shows the impact of excluding individual input variables on the evaluation metrics for ConvTempNet and DilaTransNet at the Tartaruga and Zarco stations, with ΔMetrics (the metric after removing one input minus the metric under the optimal settings) indicating the importance of each variable. Solar radiation (R) emerges as the most critical variable at both stations. Its exclusion leads to a sharp increase in RMSE and MAE, particularly at Zarco, where RMSE rises by approximately 1.1 kWh and MAE by around 0.5 kWh for ConvTempNet. This aligns with the correlation coefficients in Figure 12, where R exhibits a near-perfect positive correlation (~1.0) with PV energy output, confirming its dominant role in driving model accuracy [34].
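
The leave-one-variable-out procedure behind ΔMetrics can be sketched as follows. This is only an illustration of the bookkeeping: the deep networks are replaced by a trivial lookup-table predictor (a stand-in, not ConvTempNet or DilaTransNet), the data are a hypothetical toy set in which the target depends on R but not on M, and the same rows are reused for fitting and evaluation to keep the example short.

```python
import math
from collections import defaultdict
from itertools import product


def fit_lookup(rows, targets, cols):
    """Stand-in model: mean target for each combination of the selected columns."""
    sums = defaultdict(lambda: [0.0, 0])
    for row, y in zip(rows, targets):
        key = tuple(row[c] for c in cols)
        sums[key][0] += y
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


def rmse_for(cols, rows, targets):
    """Fit on the selected columns and return the RMSE on the same rows."""
    table = fit_lookup(rows, targets, cols)
    fallback = sum(targets) / len(targets)
    squared = [
        (y - table.get(tuple(row[c] for c in cols), fallback)) ** 2
        for row, y in zip(rows, targets)
    ]
    return math.sqrt(sum(squared) / len(squared))


def delta_rmse(features, rows, targets):
    """Delta RMSE = RMSE without one feature minus RMSE with all features."""
    baseline = rmse_for(features, rows, targets)
    return {
        f: rmse_for([c for c in features if c != f], rows, targets) - baseline
        for f in features
    }


# Toy data: radiation level R drives the target, month flag M is irrelevant.
rows = [{"R": r, "M": m} for r, m in product([0, 1, 2], [0, 1])]
targets = [10.0 * row["R"] for row in rows]

deltas = delta_rmse(["R", "M"], rows, targets)
print(deltas)
```

In this toy setting, removing R raises the RMSE sharply, whereas removing M leaves it unchanged, which mirrors the qualitative pattern described for R and M.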

Relative humidity (U) and temperature (T) also play notable roles; excluding either of them increases RMSE and MAE for DilaTransNet. Figure 12 supports this observation, showing a strong negative correlation between U and PV output and a strong positive correlation between T and PV output. This indicates that the atmospheric effects captured by U and T help improve predictive performance [35], albeit to a lesser degree than R. Interestingly, excluding T decreases RMSE and MAE for ConvTempNet at Zarco, by 0.12 kWh and 0.08 kWh, respectively. One possible explanation is that the information carried by T overlaps with that provided by other variables, such as R or U, so that the model partly fits temperature-specific noise; removing T then forces the model to rely more heavily on variables that are more directly related to PV energy output. The measured reductions provide quantitative evidence of this redundancy and indicate that temperature contributes limited unique information under the local meteorological conditions at Zarco.

In practical PV modules, unabsorbed photons contribute to heat buildup and influence module efficiency and power output. This effect varies with weather conditions and has been addressed in recent studies [36], which propose efficiency recalibration through radiation control. While our model incorporates T as an input feature, it does not explicitly account for thermal effects from unabsorbed radiation. Future extensions could integrate a module heat transfer model to enhance prediction accuracy in diverse environmental conditions.

The temporal variable hour (H) has the largest impact on DilaTransNet, particularly at Zarco. Excluding H increases RMSE by nearly 10 kWh and MAE by approximately 9 kWh, reflecting DilaTransNet's reliance on temporal patterns, even though Figure 12 shows only a weak positive correlation (~0.25) between H and PV output. This contrast suggests that, while H does not directly drive PV energy generation, it serves as a critical contextual feature for capturing the diurnal patterns and time-dependent dynamics that are essential to DilaTransNet's performance [37].
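
The gap between a weak linear correlation and a large sensitivity is expected for a periodic signal. The sketch below illustrates this with an idealized, hypothetical diurnal profile (a half-sine between 06:00 and 18:00, not measured data): the Pearson coefficient between the raw hour and the output is close to zero, while a simple nonlinear transform of the hour is strongly correlated with it.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


hours = list(range(24))

# Idealized clear-sky output: zero at night, half-sine peaking at noon.
pv = [max(0.0, math.sin(math.pi * (h - 6) / 12)) for h in hours]

# Linear correlation with the raw hour index.
r_hour = pearson(hours, pv)

# Correlation with a nonlinear encoding: absolute distance from solar noon.
noon_distance = [abs(h - 12) for h in hours]
r_noon = pearson(noon_distance, pv)

print(f"corr(hour, pv) = {r_hour:.2f}")
print(f"corr(|hour - 12|, pv) = {r_noon:.2f}")
```

A network that can learn such nonlinear transforms internally can therefore extract far more information from H than the linear coefficient suggests.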

Month (M) has minimal effects on ΔMetrics for both models, which is consistent with its near-zero correlation coefficient, implying that M does not strongly influence PV output prediction when other inputs are introduced.

Combining these insights, the analysis confirms the dominant influence of R and H on model performance. ConvTempNet is more sensitive to meteorological variables like R and U, while DilaTransNet relies heavily on temporal inputs such as H. These observations underline the complementary strengths of the two architectures, as well as the importance of prioritizing key inputs to enhance prediction accuracy.

5. Conclusions

This study investigates the effectiveness of advanced deep learning architectures for PV power forecasting under varying meteorological conditions. Two proposed models, ConvTempNet and DilaTransNet, were developed to capture the complex nonlinear relationships between meteorological variables, temporal factors, and PV energy output. Their performance was systematically evaluated against three widely used baseline models to assess their predictive capabilities. The results indicate that ConvTempNet generally achieves superior forecasting accuracy due to its ability to jointly extract spatial and sequential features, while DilaTransNet demonstrates strong robustness in capturing long-range temporal dependencies through dilated convolutions and Transformer-based attention mechanisms.

In addition, heuristic optimization algorithms, including ALA and CPO, were introduced to optimize key hyperparameters of the proposed models. The experimental results demonstrate that these optimization strategies can further improve prediction accuracy and model stability, particularly for DilaTransNet at the Tartaruga station. Sensitivity analysis also highlights that solar radiation is the most influential predictor of PV energy output, while temperature, relative humidity, and temporal variables such as the hour of the day also contribute to forecasting performance by reflecting atmospheric conditions and diurnal generation patterns.

Despite the encouraging results, several limitations should be acknowledged. First, the analysis was conducted using data from two photovoltaic power plants, which may limit the generalizability of the conclusions to other climatic regions or PV system configurations. Detailed information regarding module technology, installed capacity, tilt angle, and azimuth angle was not reported in the original dataset publication and therefore could not be incorporated into the present analysis. Future research will collect multi-region PV datasets covering diverse climatic zones and various installed photovoltaic systems to further verify the applicability of ConvTempNet and DilaTransNet. Nevertheless, the complementary performance characteristics of the two proposed models identified in the sensitivity analysis still provide useful guidance for short-term, weather-driven PV power forecasting at similar distributed PV stations. Second, although advanced optimization algorithms were introduced to improve model performance, further studies could explore additional hybrid optimization strategies and larger datasets to enhance model robustness. Third, the present models produce point forecasts; future work could incorporate probabilistic forecasting frameworks, prediction intervals, or quantile-based approaches to better support operational decision-making. Fourth, although correlation and sensitivity analyses provide insights into the importance of individual predictors, advanced explainability techniques such as SHAP values and attention visualization may further improve the interpretability of deep learning forecasts. Finally, future research could extend the framework to multi-horizon forecasting and assess model robustness under extreme weather scenarios, thereby enhancing its applicability to real-world grid operation and energy management systems.

Although the two experimental sites are both located in Portugal and share similar coastal climatic conditions, the structural design of the proposed ConvTempNet and DilaTransNet follows a climate-agnostic feature extraction logic. The networks take widely available meteorological and temporal covariates as input, without site-specific empirical parameter tuning tailored to local Portuguese coastal weather patterns. The standardized preprocessing pipeline and embedded regularization strategies are intended to further enhance robustness against distribution shifts arising from different geographic and meteorological environments. While empirical cross-climate testing is not yet possible because of limited multi-region measured data, these design principles lay a theoretical foundation for subsequent cross-site transfer applications.

The proposed lightweight and robust forecasting frameworks provide technical support for real-world photovoltaic energy management and smart grid operation. Stable short-term PV power predictions help reduce the uncertainty of renewable power output, assist grid operators in peak regulation and reserve capacity allocation, and lower grid operation costs. Furthermore, the optimized model structures and hyperparameter tuning strategies can be applied to hourly photovoltaic forecasting tasks in similar regions, facilitating efficient and safe operation of distributed PV power systems. Consequently, the study contributes to ongoing efforts toward sustainable energy development, carbon-emission reduction, and the realization of global energy transition goals.

Author Contributions

Conceptualization, Y.C.; methodology, Y.C.; software, Q.S. and B.L.; validation, Q.S. and B.L.; formal analysis, Q.S. and B.L.; investigation, Q.S. and B.L.; resources, Y.C.; data curation, Q.S., Y.C. and B.L.; writing—original draft preparation, Q.S. and B.L.; writing—review and editing, Q.W. and Y.C.; visualization, Q.S. and B.L.; supervision, Q.W. and Y.C.; funding acquisition, Y.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was jointly funded by the National Natural Science Foundation of China (No. 42301428) and the Open Fund of State Key Laboratory of Remote Sensing and Digital Earth (No. OFSLRSS202306).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Acknowledgments

The authors thank the four reviewers for their valuable comments, which significantly improved the manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following symbols/abbreviations are used in this manuscript:

Symbol/Abbreviation	Meaning/Full Term
$x$	Input feature(s)
$y$	Predicted photovoltaic energy
$f$	Convolution function
$W$	Weight matrix in neural networks
$b$	Bias term in neural networks
$σ$	Activation function (e.g., sigmoid)
$h_{t}$	Hidden state at time $t$ in LSTM
$c_{t}$	Cell state at time $t$ in LSTM
$i_{t}$	Input gate in LSTM
$f_{t}$	Forget gate in LSTM
$o_{t}$	Output gate in LSTM
$Q$	Query matrix in Transformer Layer
$K$	Key matrix in Transformer Layer
$V$	Value matrix in Transformer Layer
$d_{k}$	Dimension of key vectors in self-attention mechanism
$N$	Number of data points
$y_{i}$	Actual value of the $i$th measurement
${\hat{y}}_{i}$	Predicted value of the $i$th measurement
$\bar{y}$	Mean of the actual values
$R$	Solar radiation
$T$	Temperature
$U$	Relative humidity
$H$	Hour of the day
$M$	Month
PV	Photovoltaic
CNN	Convolutional Neural Network
LSTM	Long Short-Term Memory
TCN	Temporal Convolutional Network
BiLSTM	Bidirectional Long Short-Term Memory
VMD	Variational Mode Decomposition
SA	Self Attention
MSADBO	Dung Beetle Optimization Guided by improved Sine Algorithm
AT	Attention Mechanism
BiGRU	Bidirectional Gated Recurrent Unit
HO	Hippopotamus Optimization
RMSE	Root Mean Squared Error
MAE	Mean Absolute Error
R²	Determination Coefficient
MAPE	Mean Absolute Percentage Error
FFN	Feed-Forward Network
ReLU	Rectified Linear Unit

References

1. Hossain, M.S.; Wadi Al-Fatlawi, A.; Kumar, L.; Fang, Y.R.; Assad, M.E.H. Solar PV High-Penetration Scenario: An Overview of the Global PV Power Status and Future Growth. Energy Syst. 2024, 17, 809–865.
2. Nijsse, F.J.; Mercure, J.-F.; Ameli, N.; Larosa, F.; Kothari, S.; Rickman, J.; Vercoulen, P.; Pollitt, H. The Momentum of the Solar Energy Transition. Nat. Commun. 2023, 14, 6542.
3. Mellit, A.; Massi Pavan, A.; Ogliari, E.; Leva, S.; Lughi, V. Advanced Methods for Photovoltaic Output Power Forecasting: A Review. Appl. Sci. 2020, 10, 487.
4. Rahimi, N.; Park, S.; Choi, W.; Oh, B.; Kim, S.; Cho, Y.; Ahn, S.; Chong, C.; Kim, D.; Jin, C.; et al. A Comprehensive Review on Ensemble Solar Power Forecasting Algorithms. J. Electr. Eng. Technol. 2023, 18, 719–733.
5. Ye, H.; Yang, B.; Han, Y.; Chen, N. State-Of-The-Art Solar Energy Forecasting Approaches: Critical Potentials and Challenges. Front. Energy Res. 2022, 10, 875790.
6. Makade, R.G.; Jamil, B. Statistical Analysis of Sunshine Based Global Solar Radiation (GSR) Models for Tropical Wet and Dry Climatic Region in Nagpur, India: A Case Study. Renew. Sustain. Energy Rev. 2018, 87, 22–43.
7. Liu, W.; Mao, Z. Short-Term Photovoltaic Power Forecasting with Feature Extraction and Attention Mechanisms. Renew. Energy 2024, 226, 120437.
8. Lateko, A.A.H.; Yang, H.-T.; Huang, C.-M.; Aprillia, H.; Hsu, C.-Y.; Zhong, J.-L.; Phương, N.H. Stacking Ensemble Method with the RNN Meta-Learner for Short-Term PV Power Forecasting. Energies 2021, 14, 4733.
9. Rajagukguk, R.A.; Ramadhan, R.A.A.; Lee, H.-J. A Review on Deep Learning Models for Forecasting Time Series Data of Solar Irradiance and Photovoltaic Power. Energies 2020, 13, 6623.
10. Kumar Dhaked, D.; Narayanan, V.L.; Gopal, R.; Sharma, O.; Bhattarai, S.; Dwivedy, S.K. Exploring Deep Learning Methods for Solar Photovoltaic Power Output Forecasting: A Review. Renew. Energy Focus 2025, 53, 100682.
11. Agga, A.; Abbou, A.; Labbadi, M.; Houm, Y.E.; Ou Ali, I.H. CNN-LSTM: An Efficient Hybrid Deep Learning Architecture for Predicting Short-Term Photovoltaic Power Production. Electr. Power Syst. Res. 2022, 208, 107908.
12. Luo, X.; Zhang, D.; Zhu, X. Deep Learning Based Forecasting of Photovoltaic Power Generation by Incorporating Domain Knowledge. Energy 2021, 225, 120240.
13. Paletta, Q.; Terrén-Serrano, G.; Nie, Y.; Li, B.; Bieker, J.; Zhang, W.; Dubus, L.; Dev, S.; Feng, C. Advances in Solar Forecasting: Computer Vision with Deep Learning. Adv. Appl. Energy 2023, 11, 100150.
14. Nazir, A.; He, J.; Zhu, N.; Qureshi, S.S.; Qureshi, S.U.; Ullah, F.; Wajahat, A.; Pathan, M.S. A Deep Learning-Based Novel Hybrid CNN-LSTM Architecture for Efficient Detection of Threats in the IoT Ecosystem. Ain Shams Eng. J. 2024, 15, 102777.
15. O’Donncha, F.; Hu, Y.; Palmes, P.; Burke, M.; Filgueira, R.; Grant, J. A Spatio-Temporal LSTM Model to Forecast across Multiple Temporal and Spatial Scales. Ecol. Inform. 2022, 69, 101687.
16. Fu, Q.; Wang, C.; Han, X. A CNN-LSTM Network with Attention Approach for Learning Universal Sentence Representation in Embedded System. Microprocess. Microsyst. 2020, 74, 103051.
17. Ren, Q.; Li, Y.; Liu, Y. Transformer-Enhanced Periodic Temporal Convolution Network for Long Short-Term Traffic Flow Forecasting. Expert Syst. Appl. 2023, 227, 120203.
18. Wan, H.; Wang, J.; Gan, Q.; Quan, R.; Xia, Y.; Himmiche, S.; Chang, Y. Improving Forecasting Accuracy of Renewable Energy Generation via Periodicity-Aware Deep Learning Framework with Time2Vec. Green Energy Intell. Transp. 2026, 253, 100397.
19. Wan, H.; Wang, J.; Gan, Q.; Xia, Y.; Chang, Y.; Yan, H. Addressing Intermittency in Medium-Term Photovoltaic and Wind Power Forecasting Using a Hybrid xLSTM-TCCNN Model with Numerical Weather Predictions. Renew. Energy 2025, 253, 123618.
20. Chen, S.; Wan, H.; Peng, B.; Quan, R.; Chang, Y.; Derigent, W. Accurate Multi-Step Wind and Solar Power Forecasting Based on Multi-Scale Convolutional Kolmogorov-Arnold Network and Improved Lemming-Optimized Attention Fusion. Eng. Appl. Artif. Intell. 2026, 163, 112832.
21. Wang, X.; Gao, X.; Li, B.; Shi, Y.; Lu, X.; Yao, Y.; Wang, D.; Xu, X.; Li, H. Photovoltaic Power Prediction Considering VMD-CNN-LSTM and Migration Learning Frameworks for Poor Data Areas. In Proceedings of the 2024 IEEE 4th International Conference on Power, Electronics and Computer Applications (ICPECA), New York, NY, USA, 26–28 January 2024; IEEE: New York, NY, USA, 2024; pp. 509–514.
22. Zhang, H.; Chu, P. Prediction Study Based on TCN-BiLSTM-SA Time Series Model. In Proceedings of the 2nd International Conference on Intelligent Design and Innovative Technology (ICIDIT 2023); Atlantis Press: Amsterdam, The Netherlands, 2023; pp. 192–197.
23. Xue, H.; Ma, J.; Zhang, J.; Jin, P.; Wu, J.; Du, F. Power Forecasting for Photovoltaic Microgrid Based on MultiScale CNN-LSTM Network Models. Energies 2024, 17, 3877.
24. Wan, H.; Qiu, Z.; Wang, J.; Quan, R.; Chang, Y.; Derigent, W. Optimizing Renewable Energy Forecasting: A Hybrid Approach Integrating MSADBO, BiGRU, and TCN for PV/Wind Power Generation Prediction. J. Supercomput. 2025, 81, 1269.
25. Quan, R.; Cheng, G.; Guan, X.; Zhang, G.; Quan, J. A HO-BiGRU-Transformer Based PEMFC Degradation Prediction Method under Different Current Conditions. Renew. Energy 2026, 256, 124132.
26. Ilias, L.; Sarmas, E.; Marinakis, V.; Askounis, D.; Doukas, H. Unsupervised Domain Adaptation Methods for Photovoltaic Power Forecasting. Appl. Soft Comput. 2023, 149, 110979.
27. Shi, H.; Wei, A.; Xu, X.; Zhu, Y.; Hu, H.; Tang, S. A CNN-LSTM Based Deep Learning Model with High Accuracy and Robustness for Carbon Price Forecasting: A Case of Shenzhen’s Carbon Market in China. J. Environ. Manag. 2024, 352, 120131.
28. Agga, F.A.; Abbou, S.A.; Houm, Y.E.; Labbadi, M. Short-Term Load Forecasting Based on CNN and LSTM Deep Neural Networks. IFAC-Pap. 2022, 55, 777–781.
29. Lu, X.-Q.; Tian, J.; Liao, Q.; Xu, Z.-W.; Gan, L. CNN-LSTM Based Incremental Attention Mechanism Enabled Phase-Space Reconstruction for Chaotic Time Series Prediction. J. Electron. Sci. Technol. 2024, 22, 100256.
30. Pham, H.; Le, Q. Autodropout: Learning Dropout Patterns to Regularize Deep Networks. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtual, 2–9 February 2021; Volume 35, pp. 9351–9359.
31. Soomro, S.; Pora, W. Effect of Drop-out Layers Inside an Long Short-Term Memory for Household Load Forecast Application. In Proceedings of the 2023 5th International Congress on Human-Computer Interaction, Optimization and Robotic Applications (HORA), Istanbul, Turkey, 8–10 June 2023; pp. 1–7.
32. Böök, H.; Lindfors, A.V. Site-Specific Adjustment of a NWP-Based Photovoltaic Production Forecast. Sol. Energy 2020, 211, 779–788.
33. Jaini, S.N.B.; Lee, D.; Heng, C.W. CNN-LSTM Neural Network-Based Short-Term PV Power Generation Forecaster. In Proceedings of the 2024 IEEE International Conference on Artificial Intelligence in Engineering and Technology (IICAIET), New York, NY, USA, 26–28 August 2024; IEEE: New York, NY, USA, 2024; pp. 693–696.
34. Morales-Cervantes, A.; Lobato-Nostroza, O.; Chávez-Campos, G.M.; Chiaradia-Masselli, Y.M.; Lara-Hernández, R. Photo-Voltaic Panel Power Production Estimation with an Artificial Neural Network Using Environmental and Electrical Measurements. In Proceedings of the 2023 International Conference on Power Engineering (ICPE 2023), Basel, Switzerland, 15–17 June 2023.
35. Chakraborty, D.; Mondal, J.; Barua, H.B.; Bhattacharjee, A. Computational Solar Energy–Ensemble Learning Methods for Prediction of Solar Power Generation Based on Meteorological Parameters in Eastern India. Renew. Energy Focus 2023, 44, 277–294.
36. Heo, S.-Y.; Kim, D.H.; Song, Y.M.; Lee, G.J. Determining the Effectiveness of Radiative Cooler-Integrated Solar Cells. Adv. Energy Mater. 2022, 12, 2103258.
37. Nahed, Z.; Hatem, M.; Aissa, C. A Very Short-Term Photovoltaic Power Forecasting Model Using Linear Discriminant Analysis Method and Deep Learning Based on Multivariate Weather Datasets. Eng. Proc. 2023, 56, 1.

Figure 1. Interannual evolution of solar radiation, air temperature and relative humidity at two stations. The black solid lines denote the hourly means, while the light blue shaded areas mark standard deviations of hourly values within one day.

Figure 2. Overall framework of the proposed meteorology-driven photovoltaic energy forecasting approach.

Figure 3. Schematic diagram of the ConvTempNet. The streamlined architecture progresses from feature processing to predictive output, with dropout integrated to prevent overfitting in prediction tasks.

Figure 4. Schematic diagram of the DilaTransNet. The streamlined architecture incorporates dilated convolutions for multiscale temporal feature extraction, temporal blocks for residual learning, layer normalization for internal covariate shift reduction, and dropout layers to mitigate overfitting.

Figure 5. Impact of dropout rate on evaluation metrics for ConvTempNet and DilaTransNet at two stations. The left y-axis displays RMSE and MAE, while the right y-axis shows R² and MAPE.

Figure 6. Training and validation loss curves for ConvTempNet and DilaTransNet across epochs at two stations.

Figure 7. Comparison of predicted versus true PV energy output for ConvTempNet and DilaTransNet at two stations. The red diagonal line marks the perfect prediction.

Figure 8. Same as Figure 7, but for the three baseline models. The red diagonal lines mark the 1:1 reference.

Figure 9. Same as Figure 7, but for the ALA-optimized models. The red diagonal lines mark the 1:1 reference.

Figure 10. Same as Figure 7, but for the CPO-optimized models. The red diagonal lines mark the 1:1 reference.

Figure 11. Impact of excluding individual input variables on evaluation metrics for the two convolutional networks. ΔMetrics were calculated as the metrics after eliminating one input variable minus those under optimal settings.

Figure 12. Correlation coefficients between meteorological variables and photovoltaic energy output at two stations.

Table 1. Comparison of recent hybrid deep learning models for PV forecasting.

Study	Method	Key Features	Limitations	Proposed Models
Liu & Mao [7]	CNN-BiLSTM + Attention	Integrated attention to weight key temporal features	Relies heavily on complex attention mechanisms; computational cost increases significantly with sequence length	Uses streamlined CNN-LSTM for efficiency on shorter horizons
Wang et al. [21]	VMD-CNN-LSTM	Uses Variational Mode Decomposition (VMD) to handle volatility	VMD requires signal decomposition preprocessing, which introduces latency in real-time online forecasting.	Uses dilated convolutions to capture volatility without external decomposition steps
Zhang & Chu [22]	TCN-BiLSTM-SA	Combines TCN for sequence features and BiLSTM for state capture	Reported instability in training; complex hybrid architecture prone to overfitting on small datasets	Systematically tests dropout resilience (0.2–0.4) to solve stability issues
Xue et al. [23]	Multiscale CNN-LSTM	Spatio-temporal fusion for microgrids	Focuses primarily on spatial correlations; less effective at capturing long-range temporal dependencies (e.g., seasonal shifts)	Specifically designed with Transformers to capture long-range dependencies
Wan et al. [24]	MSADBO-AT-BiGRU-TCN	Integrates self-attention temporal convolution networks and BiGRU with dung beetle optimization to enhance feature extraction	The integration of meta-heuristic optimization and multiple neural modules increases model complexity and computational cost, which may limit real-time forecasting applications.	Adopts CNN-LSTM and TCN architectures to capture meteorological-driven temporal patterns
Quan et al. [25]	HO-BiGRU-Transformer	Combines BiGRU and Transformer architectures optimized by the Hippopotamus Optimization algorithm	Relies heavily on optimization procedures and complex attention mechanisms, leading to increased training cost and sensitivity to hyperparameter settings	Employs CNN-LSTM and TCN models to balance forecasting accuracy and computational efficiency

Table 2. Summary of meteorological conditions at Tartaruga and Zarco. The temporal coverage of the dataset spans from 1 August 2018 to 31 January 2021 with an hourly resolution.

Station	Temperature [°C]	Relative Humidity [%]	Solar Radiation [W/m²]
Tartaruga	0–40	22–98	0–1037
Zarco	0–40	22–98	0–1031

Table 3. Hyperparameter configurations in ConvTempNet and DilaTransNet.

Model	Hyperparameter	Value
ConvTempNet	Convolutional filter size	3
	Number of convolutional filters	64
	Number of LSTM layers	2
	Dropout rate	0.1
	Input sequence length	10
	Batch size	32
	Learning rate	0.001
DilaTransNet	Convolutional filter size	10
	Filter stride	1
	Dropout rate	0.1
	Number of attention heads	8
	Feed-forward network dimension	256
	Number of layers	4
	Input sequence length	10
	Batch size	32
	Learning rate	0.01

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

