# Short-Term Photovoltaic Power Forecasting Based on a Feature Rise-Dimensional Two-Layer Ensemble Learning Model

## Abstract

## 1. Introduction

- A short-term PV power forecasting method using K-means clustering, EL, the FRD approach, and QR is proposed. This method can improve the deterministic forecasting accuracy of PV power and is suitable for uncertainty analysis.
- An FRD-TLEL model is proposed. First, the XGBoost, RF, CatBoost, and LSTM models are combined within the EL framework to construct a TLEL-based deterministic forecasting model for short-term PV power, avoiding the limitations of a single forecasting model. Then, the training results of the XGBoost, RF, and CatBoost models are used directly as FRD data to optimize the forecasting dataset, so that the model carries more of the original data features while implementing EL. At the same time, the workload of training on the original data is reduced, and the accuracy and efficiency of forecasting are improved. Finally, the forecasting results of the TLEL model and the FRD models are weighted and combined using the reciprocal error method to obtain the final forecasting result, further improving forecasting accuracy.
- The forecasting performance of the proposed FRD-TLEL model is compared with that of the BP, XGBoost, RF, CatBoost, LSTM, R-XGBL, R-RFL, R-CatBL, and TLEL models in different seasons and weather types, showing that the proposed FRD-TLEL model clearly improves deterministic forecasting accuracy. The performance of the QR-based probability interval forecasting model, which builds on the deterministic forecasting results, is verified through forecasting results at different confidence levels.

## 2. Methodology

#### 2.1. Framework for Proposed Methods

- Data preprocessing. First, the raw PV power output data and meteorological data are cleaned. Next, the meteorological features are selected by calculating the Spearman correlation coefficient of the data. Then, the K-means clustering algorithm is employed to construct weather categories. After that, the data are normalized. Finally, the data are divided into training, validation, and test sets, which yields the preprocessed dataset used for forecasting.
- Deterministic forecasting. Based on the XGBoost, RF, CatBoost, and LSTM models, the TLEL model is constructed. At the same time, the FRD approach is introduced to construct the R-XGBL, R-RFL, and R-CatBL models. The above models are combined using the reciprocal error method to construct the FRD-TLEL model, and then deterministic forecasting results are obtained.
- Probability interval forecasting. Based on the deterministic forecasting error dataset and preprocessed dataset, after selecting the loss function and quantile, the quantile forecasting results are obtained, and the forecasting interval is constructed to acquire the PV power interval forecasting results based on QR.
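The combination step in the deterministic forecasting stage can be illustrated with a minimal sketch. The exact weighting formula used in the paper is not reproduced in this section, so the code below assumes the common inverse-error form, in which each sub-model's weight is proportional to the reciprocal of its error. The function names and all numbers are hypothetical and serve only as an illustration.

```python
from typing import Dict, List


def reciprocal_error_weights(errors: Dict[str, float]) -> Dict[str, float]:
    """Return weights proportional to 1/error, normalized to sum to one."""
    if any(e <= 0 for e in errors.values()):
        raise ValueError("errors must be strictly positive")
    inverse = {name: 1.0 / e for name, e in errors.items()}
    total = sum(inverse.values())
    return {name: inv / total for name, inv in inverse.items()}


def combine_forecasts(forecasts: Dict[str, List[float]],
                      weights: Dict[str, float]) -> List[float]:
    """Weighted sum of the sub-model forecasts at every time step."""
    n_points = len(next(iter(forecasts.values())))
    return [
        sum(weights[name] * series[t] for name, series in forecasts.items())
        for t in range(n_points)
    ]


# Made-up validation errors and forecasts (kW), for illustration only.
errors = {"TLEL": 2.0, "R-XGBL": 4.0, "R-RFL": 4.0, "R-CatBL": 8.0}
forecasts = {
    "TLEL": [100.0, 120.0],
    "R-XGBL": [104.0, 118.0],
    "R-RFL": [98.0, 123.0],
    "R-CatBL": [110.0, 115.0],
}
weights = reciprocal_error_weights(errors)
combined = combine_forecasts(forecasts, weights)
```

Under this assumed form, a sub-model with a large error still contributes, but its influence on the combined forecast shrinks in proportion to that error.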

#### 2.2. Data Description

^{2}), ambient temperature (°C), relative humidity (%rh), wind speed (m/s), wind direction (°), daily average precipitation (mm), etc. From the downloaded data, we selected the PV power output, solar irradiance, ambient temperature, relative humidity, and wind speed, recorded at a time step of 5 min from 0:00 to 23:55 every day between 1 April 2016 and 31 March 2018, as the raw data for this study. The three-sigma ($3\sigma $) criterion was used to detect outliers. Where large amounts of data were missing within the forecasting period, all the data for that date were deleted; scattered missing values were filled by interpolation from the preceding and following data [39]. To improve the forecasting accuracy and suit the forecasting model, the non-zero range of daily effective PV power in different seasons was analyzed, the daily forecasting period from 7:00 to 18:00 was selected, and the time step was adjusted from 5 min to 15 min. This yielded 31,725 sets of valid data, with 45 sample points per day, for the correlation analysis and weather category construction.
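The cleaning steps and the sample counts described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the helper names are hypothetical, the gap filling is assumed to be linear interpolation between the nearest valid neighbours, and the sample values are made up.

```python
from statistics import mean, pstdev
from typing import List, Optional


def three_sigma_outliers(values: List[float], k: float = 3.0) -> List[bool]:
    """Flag samples lying more than k standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    return [abs(v - mu) > k * sigma for v in values]


def fill_gaps_linear(values: List[Optional[float]]) -> List[float]:
    """Fill None entries from the nearest valid samples before and after."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no valid samples to interpolate from")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        prev = max((j for j in known if j < i), default=None)
        nxt = min((j for j in known if j > i), default=None)
        if prev is None:            # gap at the start: copy the next value
            filled[i] = values[nxt]
        elif nxt is None:           # gap at the end: copy the previous value
            filled[i] = values[prev]
        else:                       # interior gap: linear interpolation
            frac = (i - prev) / (nxt - prev)
            filled[i] = values[prev] + frac * (values[nxt] - values[prev])
    return filled


def points_per_day(start_min: int, end_min: int, step_min: int) -> int:
    """Number of samples in [start, end], both endpoints included."""
    return (end_min - start_min) // step_min + 1


flags = three_sigma_outliers([10.0] * 20 + [100.0])   # made-up power readings
repaired = fill_gaps_linear([1.0, None, None, 4.0])
daily = points_per_day(7 * 60, 18 * 60, 15)           # 7:00 to 18:00, 15 min step
n_days = 31725 // daily
```

With a 15 min step and both endpoints included, the 7:00 to 18:00 window contains 45 points, so the 31,725 valid samples correspond to 705 days.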

#### 2.3. Correlation Analysis of Multiple Features

#### 2.4. Weather Category Construction

#### 2.5. Construction of the TLEL Model

#### 2.5.1. Ensemble Learning

#### 2.5.2. XGBoost Algorithm

Here, ${f}_{k}$ is a function in the function space R; ${\widehat{y}}_{i}$ is the forecasting value of the regression tree; ${x}_{i}$ is the i-th data input; and R is the set of all possible regression tree models.
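For orientation, the standard XGBoost formulation can be written with the symbols listed in the Nomenclature. The block below is a reconstruction from the general XGBoost literature and is not copied from the paper's own equations; $n$ denotes the number of samples and $K$ the number of trees.

```latex
% Additive tree model: the forecast is the sum of K regression trees.
\widehat{y}_{i} = \sum_{k=1}^{K} f_{k}(x_{i}), \qquad f_{k} \in R

% Regularized objective: training loss plus a complexity penalty per tree.
\mathrm{Obj} = \sum_{i=1}^{n} l\left(y_{i}, \widehat{y}_{i}\right)
             + \sum_{k=1}^{K} \Omega\left(f_{k}\right),
\qquad
\Omega\left(f_{k}\right) = \gamma T + \frac{1}{2}\lambda \lVert \omega \rVert^{2}
```

Here $T$ is the number of leaf nodes, $\omega$ the vector of leaf scores, and $\gamma$ and $\lambda$ the penalty coefficients, so larger trees and larger leaf scores are both penalized.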

#### 2.5.3. RF Algorithm

1. Use the bootstrap method to repeatedly sample and randomly generate T training sets ${S}_{1},{S}_{2},\cdots ,{S}_{T}$.
2. Use each training set ${S}_{1},{S}_{2},\cdots ,{S}_{T}$ to generate a corresponding decision tree ${C}_{1},{C}_{2},\cdots ,{C}_{T}$. Before selecting attributes on each non-leaf node, randomly select m attributes from the M attributes as the splitting attribute set for the current node, and split the node using the best split among these m attributes.
3. Grow each tree fully; no pruning is needed.
4. For the test set sample B, use each decision tree to test it and obtain the corresponding category ${C}_{1}\left(B\right),{C}_{2}\left(B\right),\cdots ,{C}_{T}\left(B\right)$.
5. Using the voting method, select the category that receives the most votes among the T decision trees as the category to which sample B belongs.
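The five steps can be sketched in plain Python. To stay short, the sketch replaces full decision trees with one-split trees (stumps), so the random attribute subset is drawn once per tree instead of once per node; all function names and the toy weather data are hypothetical. For a regression target such as PV power, the vote in the last step is usually replaced by averaging the tree outputs.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def majority(labels):
    return Counter(labels).most_common(1)[0][0]


def fit_stump(rows, feature_ids):
    """Best single split (feature, threshold) among the candidate features."""
    best = None
    for f in feature_ids:
        for threshold in sorted({x[f] for x, _ in rows}):
            left = [y for x, y in rows if x[f] <= threshold]
            right = [y for x, y in rows if x[f] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, threshold, majority(left), majority(right))
    if best is None:  # no valid split: always predict the majority class
        label = majority([y for _, y in rows])
        return (0, float("inf"), label, label)
    return best[1:]


def predict_stump(stump, x):
    f, threshold, left_label, right_label = stump
    return left_label if x[f] <= threshold else right_label


def fit_forest(rows, n_trees, m, seed=0):
    rng = random.Random(seed)
    n_features = len(rows[0][0])
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]       # step 1: bootstrap set
        features = rng.sample(range(n_features), m)     # step 2: m of M attributes
        forest.append(fit_stump(sample, features))      # step 3: no pruning
    return forest


def predict_forest(forest, x):
    votes = Counter(predict_stump(s, x) for s in forest)  # step 4: one vote per tree
    return votes.most_common(1)[0][0]                     # step 5: majority vote


# Toy data: (normalized irradiance, normalized humidity) -> weather label.
sunny = [((0.55 + 0.05 * i, 0.10 + 0.02 * i), "sunny") for i in range(10)]
cloudy = [((0.05 * i, 0.60 + 0.03 * i), "cloudy") for i in range(10)]
forest = fit_forest(sunny + cloudy, n_trees=25, m=1)
label_a = predict_forest(forest, (1.0, 0.05))
label_b = predict_forest(forest, (0.0, 0.95))
```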

#### 2.5.4. CatBoost Algorithm

#### 2.5.5. LSTM Algorithm

Here, ${x}_{t}$ is the input information at time t; ${h}_{t}$ and ${h}_{t-1}$ are the output information at time t and t − 1, respectively; ${c}_{t}$ and ${c}_{t-1}$ are the cell states at time t and t − 1, respectively; $W$ and $b$ are the weight coefficients and biases of each gate, respectively; $\sigma $ and $\mathrm{tanh}$ are the $\mathrm{sigmoid}$ and hyperbolic tangent activation functions, respectively; and ${f}_{t}$, ${i}_{t}$, and ${o}_{t}$ are the state operation results of the forget gate, input gate, and output gate, respectively.

In the input gate, ${x}_{t}$ and ${h}_{t-1}$ are processed using the $\mathrm{sigmoid}$ and $\mathrm{tanh}$ functions, which jointly determine what information is stored in the memory cell state. The forget gate determines the proportion of the cell state at time t − 1 that is retained in the current cell state. It reads ${h}_{t-1}$ and ${x}_{t}$: if ${f}_{t}$ is zero, all the information in ${c}_{t-1}$ is discarded, and, if ${f}_{t}$ is one, all the information in ${c}_{t-1}$ is retained. The output gate determines the degree to which ${c}_{t}$ is passed to the cell output at time t.
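The gate behavior described above corresponds to the standard LSTM update, restated here for reference. The block is the textbook formulation, not a copy of the paper's equations; the gate subscripts on $W$ and $b$, the candidate state $\tilde{c}_{t}$, and the element-wise product $\odot$ are notation added for this sketch.

```latex
\begin{aligned}
f_{t} &= \sigma\left(W_{f}\,[h_{t-1}, x_{t}] + b_{f}\right) && \text{forget gate} \\
i_{t} &= \sigma\left(W_{i}\,[h_{t-1}, x_{t}] + b_{i}\right) && \text{input gate} \\
\tilde{c}_{t} &= \tanh\left(W_{c}\,[h_{t-1}, x_{t}] + b_{c}\right) && \text{candidate state} \\
c_{t} &= f_{t} \odot c_{t-1} + i_{t} \odot \tilde{c}_{t} && \text{cell state update} \\
o_{t} &= \sigma\left(W_{o}\,[h_{t-1}, x_{t}] + b_{o}\right) && \text{output gate} \\
h_{t} &= o_{t} \odot \tanh\left(c_{t}\right) && \text{cell output}
\end{aligned}
```

Setting $f_{t} = 0$ in the cell state update removes the $c_{t-1}$ term entirely, and $f_{t} = 1$ carries it over unchanged, which matches the two limiting cases described in the text.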

#### 2.5.6. TLEL Model

#### 2.6. Construction of the Proposed FRD-TLEL Model

#### 2.6.1. Feature Engineering

#### 2.6.2. FRD-TLEL Model

#### 2.7. Probability Interval Forecasting Model Based on QR

#### 2.8. Performance Metrics

## 3. Forecasting Results and Discussion

#### 3.1. Analysis of Deterministic Forecasting Results

#### 3.2. Analysis of Probability Interval Forecasting Results

## 4. Conclusions

- The proposed FRD-TLEL model was effective in increasing the accuracy of forecasting as compared to the single models, providing three apparent advantages. Firstly, constructing the TLEL model through EL avoids the limitations of a single forecasting model. Secondly, by inputting a small number of influential features to train each single model, the training results containing the model features are obtained, thus forming new gain features for the further training of the models. This FRD approach can preserve the value of the original features in the dataset and achieve a deep fusion between various models to improve model forecasting accuracy. Furthermore, when large errors occur in individual models, the combination of forecasting models with the reciprocal error method can largely reduce the impact of said errors on the accuracy of the forecasting models.
- The results of the deterministic forecasting experiments indicate that the proposed FRD-TLEL model has the highest forecasting accuracy among the compared models. Compared to the BP, XGBoost, RF, CatBoost, LSTM, R-XGBL, R-RFL, R-CatBL, and TLEL models, the FRD-TLEL model has the lowest MAPE and RMSE on sunny and cloudy or rainy days in spring, summer, autumn, and winter. On the sunny day in winter, the MAPE values of the other models are 13.46%, 6.69%, 5.17%, 9.49%, 8.23%, 4.5%, 4.01%, 4.33%, and 2.02%, respectively, while the MAPE of the FRD-TLEL model is the lowest at 1.03%; the RMSE values of the other models are 8.78 kW, 4.64 kW, 4.75 kW, 3.12 kW, 3.53 kW, 2.31 kW, 2.46 kW, 4.34 kW, and 1.54 kW, respectively, while the RMSE of the FRD-TLEL model is the lowest at 1.04 kW. Compared to models in some other studies, the FRD-TLEL model also has a higher forecasting accuracy.
- The experimental results of probability forecasting show that the FRD-TLEL model based on QR has a good forecasting performance for different seasons and weather conditions at the 95%, 75%, and 50% confidence levels.

## Nomenclature

PV | Photovoltaic | XGBoost | eXtreme Gradient Boosting algorithm |

AI | Artificial intelligence | R-XGBL | FRD-XGBoost-LSTM |

BP | Back propagation | R-RFL | FRD-RF-LSTM |

SVM | Support vector machine | R-CatBL | FRD-CatBoost-LSTM |

ANN | Artificial neural network | QR | Quantile regression |

RF | Random forest | WOA | Whale optimization algorithm |

DL | Deep learning | LSSVM | Least squares support vector machine model |

CNN | Convolutional neural network | ELM | Extreme learning machine |

RNN | Recurrent neural network | GRU | Gated recurrent unit |

LSTM | Long short-term memory | KDE | Kernel density estimation |

EL | Ensemble learning | MAPE | Mean absolute percentage error |

FRD | Feature rise-dimensional | RMSE | Root mean square error |

TLEL | Two-layer ensemble learning | PICP | Prediction interval coverage percentage |

FRD-TLEL | Feature rise-dimensional two-layer ensemble learning | PINAW | Prediction interval normalized average width |

$\rho $ | Spearman correlation coefficient | $b$ | Biases for each gate |

n | Number of samples | $\sigma $ | Activation function $\mathrm{sigmoid}$ |

$R({a}_{i})$ | Positional value of certain meteorological factor | $\mathrm{tanh}$ | Hyperbolic tangent activation function |

$R({d}_{i})$ | Positional value of PV power | f_{t} | State operation results of the forget gate |

$\overline{R(a)}$ | Average positional value of certain meteorological factor | i_{t} | State operation results of the input gate |

$\overline{R(d)}$ | Average positional value of PV power | o_{t} | State operation results of the output gate |

A | Given dataset | ${\omega}_{i}$ | Weight coefficient of the model |

${A}_{i}$ | The i-th object | ${e}_{\mathrm{TLEL}}$ | Error value of TLEL model |

${A}_{it}$ | The t-th feature of the i-th object | ${e}_{\mathrm{R}-\mathrm{XGBL}}$ | Error value of R-XGBL model |

${C}_{j}$ | The j-th cluster center | ${e}_{\mathrm{R}-\mathrm{RFL}}$ | Error value of R-RFL model |

${C}_{jt}$ | The t-th feature of the j-th clustering center | ${e}_{\mathrm{R}-\mathrm{CatBL}}$ | Error value of R-CatBL model |

${S}_{j}$ | The j-th cluster | ${e}_{\mathrm{min}}$ | The minimum value of ${e}_{\mathrm{TLEL}}$, ${e}_{\mathrm{R}-\mathrm{XGBL}}$, ${e}_{\mathrm{R}-\mathrm{RFL}}$, and ${e}_{\mathrm{R}-\mathrm{CatBL}}$ |

$\left|{S}_{j}\right|$ | Number of objects in the j-th cluster | ${e}_{\mathrm{sec}}$ | The second maximum of ${e}_{\mathrm{TLEL}}$, ${e}_{\mathrm{R}-\mathrm{XGBL}}$, ${e}_{\mathrm{R}-\mathrm{RFL}}$, and ${e}_{\mathrm{R}-\mathrm{CatBL}}$ |

K | Number of trees | ${e}_{\mathrm{thr}}$ | The third maximum of ${e}_{\mathrm{TLEL}}$, ${e}_{\mathrm{R}-\mathrm{XGBL}}$, ${e}_{\mathrm{R}-\mathrm{RFL}}$, and ${e}_{\mathrm{R}-\mathrm{CatBL}}$ |

f_{k} | A function in the function space R | ${e}_{\mathrm{max}}$ | The maximum value of ${e}_{\mathrm{TLEL}}$, ${e}_{\mathrm{R}-\mathrm{XGBL}}$, ${e}_{\mathrm{R}-\mathrm{RFL}}$, and ${e}_{\mathrm{R}-\mathrm{CatBL}}$ |

${\widehat{y}}_{i}$ | Forecasting value of the regression tree | ${f}_{1}$ | Forecasting result set corresponding to ${e}_{\mathrm{min}}$ |

x_{i} | The i-th data input | ${f}_{2}$ | Forecasting result set corresponding to ${e}_{\mathrm{thr}}$ |

R | Set of all possible regression tree models | ${f}_{3}$ | Forecasting result set corresponding to ${e}_{\mathrm{sec}}$ |

$l\left(y,\widehat{y}\right)$ | Difference between the forecasting value of the model and the actual value | ${f}_{4}$ | Forecasting result set corresponding to ${e}_{\mathrm{max}}$ |

$\Omega \left({f}_{k}\right)$ | Regular term of the scalar function | ${f}_{\mathrm{A}}$ | Actual PV power |

T | Number of leaf nodes | ${f}_{\mathrm{PV}}$ | Forecasting result |

$\gamma $ | Penalty function coefficient | X | A vector of P-dimensional covariates |

$\omega $ | Score of the leaf node | Y | Real value response variable |

$\lambda $ | Regularization penalty coefficient | ${g}_{\tau}\left(\cdot\right)$ | An unknown univariate connection function |

${x}_{k}^{i}$ | Feature of the i-th category of the k-th training sample | $\phi \left(\cdot\right)$ | Known function |

${\widehat{x}}_{k}^{i}$ | The average value of all ${x}_{k}^{i}$ | $\tau $ | Given quantile |

${y}_{j}$ | Label of the j-th sample | ${\beta}_{0,\tau}$ | Index coefficient |

I | Indicator function | ${\beta}_{0,\tau}^{T}$ | Transposition of ${\beta}_{0,\tau}$ |

p | Added prior term | $K\left(\cdot\right)$ | Kernel function |

$a$ | Weight coefficient | h | Bandwidth |

x_{t} | Input information at time t | ${\widehat{x}}_{i}$ | Forecasting value |

${h}_{t}$ | Output information at time t | ${{x}^{\prime}}_{i}$ | Actual value of PV power |

${c}_{t}$ | Cell states at time t | ${B}_{i}^{(\mu )}$ | Boolean value |

$W$ | Weight coefficients for each gate | $\Delta {P}_{i}$ | Bandwidth of the i-th interval |

## References

1. Liu, Z.-F.; Li, L.-L.; Liu, Y.-W.; Liu, J.-Q.; Li, H.-Y.; Shen, Q. Dynamic Economic Emission Dispatch Considering Renewable Energy Generation: A Novel Multi-Objective Optimization Approach. Energy **2021**, 235, 121407.
2. Soni, J.; Bhattacharjee, K. Multi-Objective Dynamic Economic Emission Dispatch Integration with Renewable Energy Sources and Plug-in Electrical Vehicle Using Equilibrium Optimizer. Environ. Dev. Sustain. **2023**.
3. Acharya, S.; Ganesan, S.; Kumar, D.V.; Subramanian, S. Optimization of Cost and Emission for Dynamic Load Dispatch Problem with Hybrid Renewable Energy Sources. Soft Comput. **2023**, 27, 14969–15001.
4. Zhang, J.; Liu, Z.; Chen, T. Interval Prediction of Ultra-Short-Term Photovoltaic Power Based on a Hybrid Model. Electr. Power Syst. Res. **2023**, 216, 109035.
5. Zhou, W.; Jiang, H.; Chang, J. Forecasting Renewable Energy Generation Based on a Novel Dynamic Accumulation Grey Seasonal Model. Sustainability **2023**, 15, 12188.
6. Das, U.K.; Tey, K.S.; Seyedmahmoudian, M.; Mekhilef, S.; Idris, M.Y.I.; Van Deventer, W.; Horan, B.; Stojcevski, A. Forecasting of Photovoltaic Power Generation and Model Optimization: A Review. Renew. Sustain. Energy Rev. **2018**, 81, 912–928.
7. Han, S.; Qiao, Y.; Yan, J.; Liu, Y.; Li, L.; Wang, Z. Mid-to-Long Term Wind and Photovoltaic Power Generation Prediction Based on Copula Function and Long Short Term Memory Network. Appl. Energy **2019**, 239, 181–191.
8. Tang, Y.; Yang, K.; Zhang, S.; Zhang, Z. Photovoltaic Power Forecasting: A Hybrid Deep Learning Model Incorporating Transfer Learning Strategy. Renew. Sustain. Energy Rev. **2022**, 162, 112473.
9. Niu, D.; Wang, K.; Sun, L.; Wu, J.; Xu, X. Short-Term Photovoltaic Power Generation Forecasting Based on Random Forest Feature Selection and CEEMD: A Case Study. Appl. Soft Comput. **2020**, 93, 106389.
10. Zhang, L.; He, Y.; Wu, H.; Yang, X.; Ding, M. Ultra-Short-Term Multi-Step Probability Interval Prediction of Photovoltaic Power: A Framework with Time-Series-Segment Feature Analysis. Sol. Energy **2023**, 260, 71–82.
11. Dai, Y.; Wang, Y.; Leng, M.; Yang, X.; Zhou, Q. LOWESS Smoothing and Random Forest Based GRU Model: A Short-Term Photovoltaic Power Generation Forecasting Method. Energy **2022**, 256, 124661.
12. Sobri, S.; Koohi-Kamali, S.; Rahim, N.A. Solar Photovoltaic Generation Forecasting Methods: A Review. Energy Convers. Manag. **2018**, 156, 459–497.
13. Mayer, M.J.; Gróf, G. Extensive Comparison of Physical Models for Photovoltaic Power Forecasting. Appl. Energy **2021**, 283, 116239.
14. Ahmed, R.; Sreeram, V.; Mishra, Y.; Arif, M.D. A Review and Evaluation of the State-of-the-Art in PV Solar Power Forecasting: Techniques and Optimization. Renew. Sustain. Energy Rev. **2020**, 124, 109792.
15. Wang, Z.; Wang, Y.; Cao, S.; Fan, S.; Zhang, Y.; Liu, Y. A Robust Spatial-Temporal Prediction Model for Photovoltaic Power Generation Based on Deep Learning. Comput. Electr. Eng. **2023**, 110, 108784.
16. Wang, H.; Wang, J.; Piao, Z. Photovoltaic Power Forecasting Based on Similar Time Considering Influence Factor of Bad Air Quality. Appl. Soft Comput. **2021**, 102, 106957.
17. Pan, M.; Li, C.; Gao, R.; Huang, Y.; You, H.; Gu, T.; Qin, F. Photovoltaic Power Forecasting Based on a Support Vector Machine with Improved Ant Colony Optimization. J. Clean. Prod. **2020**, 277, 123948.
18. Scott, C.; Ahsan, M.; Albarbar, A. Machine Learning for Forecasting a Photovoltaic (PV) Generation System. Energy **2023**, 278, 127807.
19. Abdellatif, A.; Mubarak, H.; Ahmad, S.; Ahmed, T.; Shafiullah, G.M.; Hammoudeh, A.; Abdellatef, H.; Rahman, M.M.; Gheni, H.M. Forecasting Photovoltaic Power Generation with a Stacking Ensemble Model. Sustainability **2022**, 14, 11083.
20. Gao, H.; Qiu, S.; Fang, J.; Ma, N.; Wang, J.; Cheng, K.; Wang, H.; Zhu, Y.; Hu, D.; Liu, H.; et al. Short-Term Prediction of PV Power Based on Combined Modal Decomposition and NARX-LSTM-LightGBM. Sustainability **2023**, 15, 8266.
21. Zhen, H.; Niu, D.; Wang, K.; Shi, Y.; Ji, Z.; Xu, X. Photovoltaic Power Forecasting Based on GA Improved Bi-LSTM in Microgrid without Meteorological Information. Energy **2021**, 231, 120908.
22. Peng, T.; Fu, Y.; Wang, Y.; Xiong, J.; Suo, L.; Nazir, M.S.; Zhang, C. An Intelligent Hybrid Approach for Photovoltaic Power Forecasting Using Enhanced Chaos Game Optimization Algorithm and Locality Sensitive Hashing Based Informer Model. J. Build. Eng. **2023**, 78, 107635.
23. Banik, R.; Biswas, A. Improving Solar PV Prediction Performance with RF-CatBoost Ensemble: A Robust and Complementary Approach. Renew. Energy Focus **2023**, 46, 207–221.
24. Wang, F.; Xuan, Z.; Zhen, Z.; Li, K.; Wang, T.; Shi, M. A Day-Ahead PV Power Forecasting Method Based on LSTM-RNN Model and Time Correlation Modification under Partial Daily Pattern Prediction Framework. Energy Convers. Manag. **2020**, 212, 112766.
25. Guo, X.; Gao, Y.; Zheng, D.; Ning, Y.; Zhao, Q. Study on Short-Term Photovoltaic Power Prediction Model Based on the Stacking Ensemble Learning. Energy Rep. **2020**, 6, 1424–1431.
26. Wu, Y.-K.; Huang, C.-L.; Phan, Q.-T.; Li, Y.-Y. Completed Review of Various Solar Power Forecasting Techniques Considering Different Viewpoints. Energies **2022**, 15, 3320.
27. Talaat, M.; Said, T.; Essa, M.A.; Hatata, A.Y. Integrated MFFNN-MVO Approach for PV Solar Power Forecasting Considering Thermal Effects and Environmental Conditions. Int. J. Electr. Power Energy Syst. **2022**, 135, 107570.
28. Li, P.; Zhou, K.; Lu, X.; Yang, S. A Hybrid Deep Learning Model for Short-Term PV Power Forecasting. Appl. Energy **2020**, 259, 114216.
29. Liu, Y.; Liu, Y.; Cai, H.; Zhang, J. An Innovative Short-Term Multihorizon Photovoltaic Power Output Forecasting Method Based on Variational Mode Decomposition and a Capsule Convolutional Neural Network. Appl. Energy **2023**, 343, 121139.
30. Van Der Meer, D.W.; Widén, J.; Munkhammar, J. Review on Probabilistic Forecasting of Photovoltaic Power Production and Electricity Consumption. Renew. Sustain. Energy Rev. **2018**, 81, 1484–1512.
31. Liu, L.; Zhao, Y.; Chang, D.; Xie, J.; Ma, Z.; Sun, Q.; Yin, H.; Wennersten, R. Prediction of Short-Term PV Power Output and Uncertainty Analysis. Appl. Energy **2018**, 228, 700–711.
32. Li, K.; Wang, R.; Lei, H.; Zhang, T.; Liu, Y.; Zheng, X. Interval Prediction of Solar Power Using an Improved Bootstrap Method. Sol. Energy **2018**, 159, 97–112.
33. Mitrentsis, G.; Lens, H. An Interpretable Probabilistic Model for Short-Term Solar Power Forecasting Using Natural Gradient Boosting. Appl. Energy **2022**, 309, 118473.
34. Gu, B.; Shen, H.; Lei, X.; Hu, H.; Liu, X. Forecasting and Uncertainty Analysis of Day-Ahead Photovoltaic Power Using a Novel Forecasting Method. Appl. Energy **2021**, 299, 117291.
35. Long, H.; Zhang, C.; Geng, R.; Wu, Z.; Gu, W. A Combination Interval Prediction Model Based on Biased Convex Cost Function and Auto-Encoder in Solar Power Prediction. IEEE Trans. Sustain. Energy **2021**, 12, 1561–1570.
36. Pan, C.; Tan, J.; Feng, D. Prediction Intervals Estimation of Solar Generation Based on Gated Recurrent Unit and Kernel Density Estimation. Neurocomputing **2021**, 453, 552–562.
37. Huang, Q.; Wei, S. Improved Quantile Convolutional Neural Network with Two-Stage Training for Daily-Ahead Probabilistic Forecasting of Photovoltaic Power. Energy Convers. Manag. **2020**, 220, 113085.
38. dka Solar Center. 263.0kW, Total of All Sites. Available online: https://dkasolarcentre.com.au/source/alice-springs/yulara-total-of-all-yulara-sites-1 (accessed on 19 September 2023).
39. Huang, C.; Yang, M. Memory Long and Short Term Time Series Network for Ultra-Short-Term Photovoltaic Power Forecasting. Energy **2023**, 279, 127961.
40. Liu, D.; Sun, K. Random Forest Solar Power Forecast Based on Classification Optimization. Energy **2019**, 187, 115940.
41. Zhen, Z.; Liu, J.; Zhang, Z.; Wang, F.; Chai, H.; Yu, Y.; Lu, X.; Wang, T.; Lin, Y. Deep Learning Based Surface Irradiance Mapping Model for Solar PV Power Forecasting Using Sky Image. IEEE Trans. Ind. Appl. **2020**, 56, 3385–3396.
42. Khan, W.; Walker, S.; Zeiler, W. Improved Solar Photovoltaic Energy Generation Forecast Using Deep Learning-Based Ensemble Stacking Approach. Energy **2022**, 240, 122812.
43. Zhou, B.; Chen, X.; Li, G.; Gu, P.; Huang, J.; Yang, B. XGBoost–SFS and Double Nested Stacking Ensemble Model for Photovoltaic Power Forecasting under Variable Weather Conditions. Sustainability **2023**, 15, 13146.
44. Prasad, R.; Ali, M.; Kwan, P.; Khan, H. Designing a Multi-Stage Multivariate Empirical Mode Decomposition Coupled with Ant Colony Optimization and Random Forest Model to Forecast Monthly Solar Radiation. Appl. Energy **2019**, 236, 778–792.
45. Prokhorenkova, L.; Gusev, G.; Vorobev, A.; Dorogush, A.V.; Gulin, A. CatBoost: Unbiased Boosting with Categorical Features. arXiv **2019**, arXiv:1706.09516.
46. Zhou, H.; Zhang, Y.; Yang, L.; Liu, Q.; Yan, K.; Du, Y. Short-Term Photovoltaic Power Forecasting Based on Long Short Term Memory Neural Network and Attention Mechanism. IEEE Access **2019**, 7, 78063–78074.
47. Ospina, J.; Newaz, A.; Faruque, M.O. Forecasting of PV Plant Output Using Hybrid Wavelet-based LSTM-DNN Structure Model. IET Renew. Power Gener. **2019**, 13, 1087–1095.
48. Pirhooshyaran, M.; Scheinberg, K.; Snyder, L.V. Feature Engineering and Forecasting via Derivative-Free Optimization and Ensemble of Sequence-to-Sequence Networks with Applications in Renewable Energy. Energy **2020**, 196, 117136.
49. Salcedo-Sanz, S.; Cornejo-Bueno, L.; Prieto, L.; Paredes, D.; García-Herrera, R. Feature Selection in Machine Learning Prediction Systems for Renewable Energy Applications. Renew. Sustain. Energy Rev. **2018**, 90, 728–741.
50. Hu, J.; Tang, J.; Lin, Y. A Novel Wind Power Probabilistic Forecasting Approach Based on Joint Quantile Regression and Multi-Objective Optimization. Renew. Energy **2020**, 149, 141–164.
51. Ray, B.; Shah, R.; Islam, M.R.; Islam, S. A New Data Driven Long-Term Solar Yield Analysis Model of Photovoltaic Power Plants. IEEE Access **2020**, 8, 136223–136233.
52. Yildiz, C.; Acikgoz, H.; Korkmaz, D.; Budak, U. An Improved Residual-Based Convolutional Neural Network for Very Short-Term Wind Power Forecasting. Energy Convers. Manag. **2021**, 228, 113731.
53. An, Y.; Dang, K.; Shi, X.; Jia, R.; Zhang, K.; Huang, Q. A Probabilistic Ensemble Prediction Method for PV Power in the Nonstationary Period. Energies **2021**, 14, 859.

**Figure 2.** Scatter plots of the PV power with solar irradiance, ambient temperature, relative humidity, and wind speed. (**a**) Solar irradiance; (**b**) ambient temperature; (**c**) relative humidity; and (**d**) wind speed.

**Figure 3.** Heat map of the Spearman correlation coefficient between the PV power and various meteorological factors.

**Figure 6.** Comparison of forecasting results of various models on sunny days in different seasons. (**a**) Spring; (**b**) summer; (**c**) autumn; and (**d**) winter.

**Figure 7.** Comparison of forecasting results of various models on cloudy or rainy days in different seasons. (**a**) Spring; (**b**) summer; (**c**) autumn; and (**d**) winter.

**Figure 8.** The decrease in the performance metrics of the FRD-TLEL model compared to other models on sunny days in different seasons and with different weather types. (**a**) Spring; (**b**) summer; (**c**) autumn; and (**d**) winter.

**Figure 9.** The decrease in the performance metrics of the FRD-TLEL model compared to other models on cloudy or rainy days in different seasons and with different weather types. (**a**) Spring; (**b**) summer; (**c**) autumn; and (**d**) winter.

**Figure 10.** Forecasting intervals for different seasons and weather types at a 95% confidence level. (**a**) Sunny day in spring; (**b**) sunny day in summer; (**c**) sunny day in autumn; (**d**) sunny day in winter; (**e**) cloudy or rainy day in spring; (**f**) cloudy or rainy day in summer; (**g**) cloudy or rainy day in autumn; and (**h**) cloudy or rainy day in winter.

**Figure 11.** Forecasting intervals for different seasons and weather types at a 75% confidence level. (**a**) Sunny day in spring; (**b**) sunny day in summer; (**c**) sunny day in autumn; (**d**) sunny day in winter; (**e**) cloudy or rainy day in spring; (**f**) cloudy or rainy day in summer; (**g**) cloudy or rainy day in autumn; and (**h**) cloudy or rainy day in winter.

**Figure 12.** Forecasting intervals for different seasons and weather types at a 50% confidence level. (**a**) Sunny day in spring; (**b**) sunny day in summer; (**c**) sunny day in autumn; (**d**) sunny day in winter; (**e**) cloudy or rainy day in spring; (**f**) cloudy or rainy day in summer; (**g**) cloudy or rainy day in autumn; and (**h**) cloudy or rainy day in winter.

Feature | Cluster 1 | Cluster 2 | Cluster 3 |
---|---|---|---|
Maximum ambient temperature (°C) | 29.59 | 23.34 | 33.09 |
Minimum ambient temperature (°C) | 14.72 | 16.01 | 18.54 |
Average ambient temperature (°C) | 25.06 | 20.26 | 28.97 |
Maximum relative humidity (%rh) | 57.32 | 67.30 | 40.23 |
Minimum relative humidity (%rh) | 19.26 | 37.43 | 11.86 |
Average relative humidity (%rh) | 29.92 | 50.70 | 18.32 |
Maximum solar irradiance (W/m^{2}) | 1054.96 | 551.19 | 1159.27 |
Minimum solar irradiance (W/m^{2}) | 26.01 | 18.61 | 84.03 |
Average solar irradiance (W/m^{2}) | 543.14 | 191.09 | 736.22 |

Seasons | Sunny Days | Cloudy Days | Rainy Days |
---|---|---|---|
Spring | 88 | 71 | 20 |
Summer | 127 | 33 | 20 |
Autumn | 79 | 90 | 14 |
Winter | 62 | 86 | 15 |

Seasons | Metrics | BP | XGBoost | RF | CatBoost | LSTM | R-XGBL | R-RFL | R-CatBL | TLEL | FRD-TLEL |
---|---|---|---|---|---|---|---|---|---|---|---|
Spring | MAPE (%) | 10.93 | 4.17 | 2.08 | 2.24 | 2.68 | 2.77 | 1.99 | 2.02 | 1.92 | 1.17 |
Spring | RMSE (kW) | 10.27 | 3.56 | 2.41 | 2.58 | 3.27 | 2.98 | 2.11 | 2.42 | 2.37 | 1.37 |
Summer | MAPE (%) | 12.68 | 5.34 | 4.72 | 6.16 | 9.33 | 4.79 | 3.31 | 5.55 | 3.75 | 2.55 |
Summer | RMSE (kW) | 10.95 | 7.6 | 4.97 | 7.94 | 8.19 | 5.68 | 3.2 | 4.93 | 4.01 | 2.66 |
Autumn | MAPE (%) | 13.35 | 5.86 | 5.7 | 4.14 | 7.51 | 4.64 | 4.12 | 3.94 | 3.82 | 2.42 |
Autumn | RMSE (kW) | 4.66 | 4.91 | 4.65 | 4.35 | 4.11 | 4.9 | 2.08 | 4.28 | 4.06 | 2.05 |
Winter | MAPE (%) | 13.46 | 6.69 | 5.17 | 9.49 | 8.23 | 4.5 | 4.01 | 4.33 | 2.02 | 1.03 |
Winter | RMSE (kW) | 8.78 | 4.64 | 4.75 | 3.12 | 3.53 | 2.31 | 2.46 | 4.34 | 1.54 | 1.04 |

**Table 4.** The error comparison of different forecasting methods on cloudy or rainy days in different seasons.

Seasons | Metrics | BP | XGBoost | RF | CatBoost | LSTM | R-XGBL | R-RFL | R-CatBL | TLEL | FRD-TLEL |
---|---|---|---|---|---|---|---|---|---|---|---|
Spring | MAPE (%) | 12.18 | 7.63 | 7.29 | 7.02 | 7.66 | 7.14 | 7.15 | 6.99 | 6.95 | 4.54 |
Spring | RMSE (kW) | 11.25 | 7.72 | 6.98 | 6.39 | 8.4 | 6.2 | 5.9 | 5.95 | 5.36 | 4.4 |
Summer | MAPE (%) | 9.11 | 5.95 | 5.91 | 6.11 | 8.03 | 5.28 | 4.24 | 4.92 | 3.69 | 2.89 |
Summer | RMSE (kW) | 7.10 | 6.3 | 5.57 | 5.69 | 6.67 | 4.99 | 4.59 | 3.53 | 3.25 | 3.22 |
Autumn | MAPE (%) | 18.31 | 8.69 | 14.95 | 11.79 | 12.33 | 5.85 | 14.91 | 8.95 | 5.13 | 4.63 |
Autumn | RMSE (kW) | 5.33 | 5.79 | 4.63 | 3.97 | 5.22 | 4.81 | 4.3 | 3.79 | 3.35 | 3.22 |
Winter | MAPE (%) | 19.33 | 8.95 | 10.08 | 15.88 | 10.53 | 6.62 | 7.12 | 10.07 | 5.49 | 3.26 |
Winter | RMSE (kW) | 6.03 | 6.59 | 5.92 | 6.73 | 5.81 | 5.94 | 4.41 | 5.95 | 5.02 | 3.38 |

Metrics | BP | XGBoost | RF | CatBoost | LSTM | R-XGBL | R-RFL | R-CatBL | TLEL | FRD-TLEL |
---|---|---|---|---|---|---|---|---|---|---|
MAPE (%) | 13.66 | 6.31 | 6.72 | 7.80 | 9.87 | 5.43 | 5.80 | 5.76 | 4.10 | 2.81 |
RMSE (kW) | 8.40 | 5.94 | 4.78 | 4.54 | 5.98 | 5.03 | 3.85 | 4.03 | 3.81 | 2.88 |
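The two deterministic metrics reported in the error tables can be computed as in the sketch below. It uses the standard definitions of MAPE and RMSE, since the paper's own formulas are not reproduced in this section; the function names and sample values are hypothetical.

```python
import math
from typing import Sequence


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute percentage error in percent; actual values must be non-zero."""
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    ratios = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)


def rmse(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Root mean square error, in the unit of the inputs (kW here)."""
    squared = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up actual and forecast PV power values (kW).
actual_kw = [100.0, 200.0]
forecast_kw = [110.0, 180.0]
example_mape = mape(actual_kw, forecast_kw)
example_rmse = rmse(actual_kw, forecast_kw)
```

MAPE is scale-free but breaks down near zero output, which is one reason a forecasting window restricted to daylight hours is convenient; RMSE keeps the unit of the data and weights large errors more heavily.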

**Table 6.** The average running time of different types of forecasting models (formatted in minutes.seconds.milliseconds).

Season | Weather Type | TLEL Model | FRD-TLEL Model |
---|---|---|---|
Spring | Sunny day | 1.14.95 | 4.50.15 |
Spring | Cloudy or rainy day | 1.38.28 | 5.25.31 |
Summer | Sunny day | 1.28.85 | 6.29.67 |
Summer | Cloudy or rainy day | 1.21.06 | 3.26.35 |
Autumn | Sunny day | 2.24.64 | 5.58.79 |
Autumn | Cloudy or rainy day | 3.52.26 | 9.22.83 |
Winter | Sunny day | 4.36.56 | 9.34.36 |
Winter | Cloudy or rainy day | 4.53.42 | 16.18.82 |

**Table 7.** Interval forecasting performance metrics for different seasons and weather types at different confidence levels.

Confidence Level | Weather Type | Season | PICP | PINAW (%) |
---|---|---|---|---|
95% | Sunny day | Spring | 1 | 0.15 |
95% | Sunny day | Summer | 0.98 | 0.32 |
95% | Sunny day | Autumn | 0.98 | 0.26 |
95% | Sunny day | Winter | 0.98 | 0.23 |
95% | Cloudy or rainy day | Spring | 0.93 | 0.37 |
95% | Cloudy or rainy day | Summer | 0.96 | 0.39 |
95% | Cloudy or rainy day | Autumn | 0.96 | 0.34 |
95% | Cloudy or rainy day | Winter | 0.96 | 0.41 |
75% | Sunny day | Spring | 0.87 | 0.09 |
75% | Sunny day | Summer | 0.91 | 0.19 |
75% | Sunny day | Autumn | 0.89 | 0.15 |
75% | Sunny day | Winter | 0.96 | 0.13 |
75% | Cloudy or rainy day | Spring | 0.71 | 0.22 |
75% | Cloudy or rainy day | Summer | 0.89 | 0.23 |
75% | Cloudy or rainy day | Autumn | 0.82 | 0.2 |
75% | Cloudy or rainy day | Winter | 0.89 | 0.24 |
50% | Sunny day | Spring | 0.56 | 0.05 |
50% | Sunny day | Summer | 0.64 | 0.11 |
50% | Sunny day | Autumn | 0.58 | 0.09 |
50% | Sunny day | Winter | 0.93 | 0.08 |
50% | Cloudy or rainy day | Spring | 0.52 | 0.13 |
50% | Cloudy or rainy day | Summer | 0.78 | 0.13 |
50% | Cloudy or rainy day | Autumn | 0.71 | 0.12 |
50% | Cloudy or rainy day | Winter | 0.8 | 0.14 |
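The two interval metrics in Table 7 can be computed as in the following sketch. It assumes the usual definitions: PICP is the fraction of actual values that fall inside their forecasting interval, and PINAW is the mean interval width divided by the range of the actual values. The paper's exact normalization is not shown in this section, and the function names and sample values are hypothetical.

```python
from typing import Sequence


def picp(actual: Sequence[float], lower: Sequence[float],
         upper: Sequence[float]) -> float:
    """Fraction of actual values covered by [lower, upper]."""
    hits = [1 if lo <= a <= up else 0
            for a, lo, up in zip(actual, lower, upper)]
    return sum(hits) / len(hits)


def pinaw(actual: Sequence[float], lower: Sequence[float],
          upper: Sequence[float]) -> float:
    """Mean interval width normalized by the range of the actual values."""
    value_range = max(actual) - min(actual)
    if value_range == 0:
        raise ValueError("actual values must not all be equal")
    mean_width = sum(up - lo for lo, up in zip(lower, upper)) / len(lower)
    return mean_width / value_range


# Made-up actual values and interval bounds (kW).
actual_kw = [1.0, 2.0, 3.0, 4.0]
lower_kw = [0.5, 1.5, 3.5, 3.0]
upper_kw = [1.5, 2.5, 4.5, 5.0]
coverage = picp(actual_kw, lower_kw, upper_kw)
width = pinaw(actual_kw, lower_kw, upper_kw)
```

The two metrics pull in opposite directions: widening every interval raises PICP but also raises PINAW, so a good interval forecast reaches its nominal coverage with the narrowest width possible.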

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Wang, H.; Yan, S.; Ju, D.; Ma, N.; Fang, J.; Wang, S.; Li, H.; Zhang, T.; Xie, Y.; Wang, J.
Short-Term Photovoltaic Power Forecasting Based on a Feature Rise-Dimensional Two-Layer Ensemble Learning Model. *Sustainability* **2023**, *15*, 15594.
https://doi.org/10.3390/su152115594
