Photovoltaic Power Generation Forecasting with Hidden Markov Model and Long Short-Term Memory in MISO and SISO Configurations
Abstract
1. Introduction
1.1. Related Works
1.2. Contributions
- PV prediction is performed at a short time horizon (five minutes ahead) with five-minute time steps using the Ambient Weather dataset from Puerto Rico. The Caribbean location is challenging because weather predictions there have historically been unreliable, owing to the high variability of the winds and the complicated heat dynamics of sea and land throughout the day.
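As a minimal sketch of the five-minutes-ahead setup, the snippet below pairs each sample at time t with the target value one step (five minutes) later, once with a single input column (SISO) and once with several (MISO). The function name `make_one_step_pairs`, the column layout, and the numbers are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import numpy as np

def make_one_step_pairs(table, feature_cols, target_col):
    """Pair every row at time t with the target value at time t + 1."""
    data = np.asarray(table, dtype=float)
    x = data[:-1][:, feature_cols]  # present measures (model input)
    y = data[1:, target_col]        # next-step measures (training reference)
    return x, y

# Hypothetical five-minute samples; columns: [active power, irradiance].
samples = np.array([
    [0.0, 10.0],
    [1.0, 20.0],
    [2.0, 30.0],
    [3.0, 40.0],
])

# SISO: active power alone predicts the next active power value.
x_siso, y_siso = make_one_step_pairs(samples, [0], 0)

# MISO: active power and irradiance together predict the next active power value.
x_miso, y_miso = make_one_step_pairs(samples, [0, 1], 0)
```

The only difference between the two configurations in this sketch is the number of input columns; the target is the same single output in both cases.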
1.3. Outline
2. Proposed Workflow
2.1. Input Variables
2.2. Outlier Detection and Removal
2.3. Split Data
2.4. Training Data
2.5. Testing Data
2.6. Training Model
2.6.1. Long Short-Term Memory (LSTM)
2.6.2. Implementation
2.7. Metrics
3. Dataset Description
- DKA Solar [14]: This online hub provides a platform for sharing weather measurements obtained from PV farms located in Australia. The dataset used to evaluate the proposed models spans from 2013 to 2020, with 1,281,324 measurements taken every five minutes. The first 12,950 values were not used because they contain no active power values; the remaining 1,268,374 measurements still form a large sample. In addition, DKA has 13 numeric features such as active power, wind speed, weather temperature, relative humidity, global horizontal irradiance, and others. The dataset was accessed on 22 June 2022. The first task in using the DKA dataset was to identify the site and time span selected by previous state-of-the-art papers, so that the results of the proposed method could be compared with theirs. Next, time spans were selected in which all features were available, because some features contained zero or empty values for some periods. Finally, the step described in Section 2.1 of the proposed workflow was executed.
- Ambient Weather [24]: This is a platform used to register, share, and download weather measurements. The “UPRM CID Sustainable Energy Center, Mayagüez” device in Puerto Rico was selected. The data used span from 20 February 2022 to 6 September 2022, with 56,340 measurements taken every five minutes and 18 features. Ambient Weather is a community where owners of meteorological sensors can seamlessly share measurements from Puerto Rico in real time. For instance, the Sustainable Energy Center (SEC) laboratory operates two meteorological stations, contributing real-time data such as irradiance, temperature, humidity, and more. However, the Ambient Weather dataset currently only offers records dating back one year, resulting in a relatively short time span for analysis. Furthermore, electrical interruptions have occasionally led to gaps in the data from certain meteorological stations. To address this, the first task involved downloading data from various stations and selecting a time span with minimal gaps. The remaining gaps were filled using the ffill (forward fill) technique available in Python 3.10.7 [25]. Subsequently, the step described in Section 2.1 of the proposed workflow was executed.
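The gap-filling step can be illustrated with a short sketch. It assumes the pandas library (the text only names Python and ffill), a hypothetical column name `solar_radiation`, and made-up readings: the series is first put on a regular five-minute grid so that missing timestamps become explicit, and each gap is then filled with the last valid observation.

```python
import numpy as np
import pandas as pd

# Hypothetical five-minute readings: the 10:10 row is missing entirely
# (e.g., a power interruption) and the 10:15 reading is empty.
index = pd.to_datetime([
    "2022-02-20 10:00",
    "2022-02-20 10:05",
    "2022-02-20 10:15",
    "2022-02-20 10:20",
])
raw = pd.DataFrame({"solar_radiation": [500.0, 510.0, np.nan, 530.0]}, index=index)

# Reindex to a regular five-minute grid; the missing 10:10 row appears as NaN.
regular = raw.asfreq("5min")

# Forward fill: propagate the last valid observation into each gap.
filled = regular.ffill()
```

Forward fill repeats the last known value, so long gaps turn into flat segments; this is one reason for choosing a time span with few gaps before filling.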
4. Results
5. Discussion
- Experiment 1 presented good results using only one input, which reduced the model’s complexity and training time. Typically, the predicted values differ only slightly from the real values; the error is smaller than 4 W in most cases.
- The trained models perform well on signals with few outliers. The model can predict the common behavior of irradiance or power signals in PV systems. However, it does not produce good predictions for measurements with sudden fluctuations, because the datasets contain no information on cloud movements that would let the ML model anticipate a sudden fluctuation or outlier. Future work will include cloud movement information to train the prediction model to account for these abrupt variations.
- Experiment 3 showed the best results because it used more information than other experiments, but the model used here was more complex than in other experiments.
- The results in Table 4 show that adding more features to the ML model does not guarantee better results. For example, Experiment 3, with five features, produced better results than Experiment 2, which has eight features.
- Another example can be found in Experiment 6 and Experiment 7. Both experiments use the same subset of the dataset, but Experiment 7 has fewer features (only two) than Experiment 6 (five features). Nevertheless, Experiment 7 achieved better results than Experiment 6. One explanation is that additional features can introduce noise instead of relevant information; because of this, feature analysis, such as correlation analysis, is an essential part of building ML models.
- The LSTM method outperforms the traditional ML methods used in forecasting problems because LSTM can save and remove information. This was confirmed with Experiment 3, which applied the LSTM method and obtained better results than [7], where SVM was utilized.
- In accordance with what was expected, the outlier removal step improved the model’s performance. Comparing Experiment 1, where outliers were removed, to Experiment 5, in which outliers were kept, the results obtained in Experiment 1 were more accurate.
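The correlation analysis mentioned in the discussion can be sketched as follows. This is an illustrative example with synthetic numbers, not the authors' procedure: it ranks candidate features by the absolute Pearson correlation with the target, using the abbreviations `Glo_H` and `Wind_D` as hypothetical feature names. The helper `rank_features_by_correlation` is an assumption introduced here.

```python
import numpy as np

def rank_features_by_correlation(features, target):
    """Return (name, Pearson r) pairs sorted by |r|, strongest first."""
    ranked = []
    for name, values in features.items():
        r = np.corrcoef(values, target)[0, 1]  # off-diagonal entry is Pearson r
        ranked.append((name, float(r)))
    return sorted(ranked, key=lambda item: abs(item[1]), reverse=True)

# Synthetic example: active power and two candidate input features.
power = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
features = {
    "Glo_H": np.array([0.1, 1.1, 1.9, 3.2, 3.9, 5.1]),   # tracks power closely
    "Wind_D": np.array([3.0, 1.0, 4.0, 1.0, 5.0, 2.0]),  # nearly unrelated
}

ranking = rank_features_by_correlation(features, power)
for name, r in ranking:
    print(f"{name}: r = {r:.2f}")
```

A ranking of this kind only captures linear relationships, so a weakly correlated feature is a candidate for removal rather than proof that it carries no information.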
6. Conclusions
- The preprocessing step and specific feature selection used data correlation analysis. Horizontal irradiance and active power were the most correlated variables, with a correlation coefficient of 0.96. Therefore, horizontal irradiance and active power are the most important features for active power prediction.
- Applying HMM for outlier detection and elimination enables the classification of measures without the need for a predefined threshold setup. Outlier detection and elimination improved the results compared to the original signal. This is evident when comparing Experiment 1 to Experiment 5, where Experiment 1 used a signal without outliers as input, whereas Experiment 5 used the original signal as input. The results of Experiment 1 were better.
- The Puerto Rico dataset [24] has more outliers than the Australian dataset [14]. Because of this, the proposed ML model trained with the Australian dataset produced better results than the model trained with the Puerto Rico dataset. Although this model does not consider outliers, we understand that cloud dynamics can cause dips that are identified as outliers. Future work will improve the ML model prediction by including outliers.
- The proposed ML method is an excellent tool for reducing the photovoltaic generation planning error implicit in medium- or long-term prediction by updating the generation planning at regular intervals. This enables an energy management system (EMS) to execute necessary actions such as battery charging or utilizing grid energy to maintain high-quality service. However, the proposed ML model does not capture outliers because it requires additional information about cloud movements, which is currently unavailable in the datasets used for this study.
- The outlier detection and elimination strategy using HMM can be used in preprocessing steps for weather datasets and other datasets with time series variables. In addition, the LSTM model can be used in short-term generation planning to complement the information used in EMS and enable proactive response to specific situations, for example, activating batteries when the irradiance decreases below the defined limit.
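The idea of labeling measurements with an HMM instead of a fixed cut-off can be sketched with a two-state Gaussian HMM decoded by the Viterbi algorithm. This is a simplified illustration, not the authors' implementation: the observation is the absolute difference between consecutive points, and the prior, transition, mean, and standard deviation values are hypothetical and fixed by hand, whereas in practice they would be estimated from the data (for example with the Baum-Welch algorithm).

```python
import numpy as np

def log_gauss(x, mean, std):
    """Log-density of a univariate Gaussian."""
    return -0.5 * np.log(2 * np.pi * std**2) - (x - mean) ** 2 / (2 * std**2)

def viterbi(obs, log_prior, log_trans, means, stds):
    """Most likely hidden-state sequence for a Gaussian-emission HMM."""
    n, k = len(obs), len(means)
    log_emit = np.stack([log_gauss(obs, means[s], stds[s]) for s in range(k)], axis=1)
    score = np.empty((n, k))
    back = np.zeros((n, k), dtype=int)
    score[0] = log_prior + log_emit[0]
    for t in range(1, n):
        cand = score[t - 1][:, None] + log_trans  # cand[i, j]: move from i to j
        back[t] = np.argmax(cand, axis=0)
        score[t] = np.max(cand, axis=0) + log_emit[t]
    path = np.empty(n, dtype=int)
    path[-1] = np.argmax(score[-1])
    for t in range(n - 2, -1, -1):
        path[t] = back[t + 1, path[t + 1]]
    return path

# Hypothetical power signal with a sudden dip at index 4.
power = np.array([5.0, 5.1, 5.2, 5.1, 1.0, 5.2, 5.3, 5.2])
delta = np.abs(np.diff(power, prepend=power[0]))  # change between consecutive points

# Assumed parameters: state 0 = regular change, state 1 = abrupt change.
log_prior = np.log(np.array([0.9, 0.1]))
log_trans = np.log(np.array([[0.95, 0.05], [0.50, 0.50]]))
states = viterbi(delta, log_prior, log_trans, means=[0.1, 4.0], stds=[0.2, 1.0])
abrupt = np.flatnonzero(states == 1)  # indices of abrupt steps
```

With these numbers the decoder marks two steps as abrupt: the drop into the dip and the jump back out of it. How flagged steps are mapped to removed samples is a separate design decision that this sketch leaves open.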
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
Abbreviation | Meaning |
---|---|
PV | Photovoltaic |
ML | Machine learning |
LSTM | Long short-term memory |
RCC-LSTM | Radiation classification LSTM |
MSE | Mean square error |
RMSE | Root mean square error |
MAE | Mean absolute error |
HMM | Hidden Markov model |
TWh | Terawatt-hour |
ARIMA | Autoregressive integrated moving average |
MLP | Multilayer perceptron |
SVM | Support vector machine |
KNN | K-nearest neighbor |
AR | Autoregressive |
ARX | Autoregressive exogenous input |
CNN | Convolutional neural network |
ConvLSTM | Convolutional LSTM |
ARMA | Autoregressive moving average |
FC-LSTM | Fully connected LSTM |
MAPE | Mean absolute percentage error |
ESN | Echo state network |
R2 | R squared |
RNN | Recurrent neural network |
SISO | Single input to obtain a single output |
MISO | Multiple inputs to obtain a single output |
DTR | Decision tree |
NaN | Not a number |
Pow | Active power |
Win_S | Wind speed |
Tem | Temperature |
Hum | Humidity |
Glo_H | Global horizontal irradiance |
Dif_H | Diffuse horizontal irradiance |
Wind_D | Wind direction |
Wea_D | Weather daily rainfall |
UPRM | University of Puerto Rico at Mayagüez |
CID | Research and development center |
EMS | Energy management system |
References
- Dhar, A.; Naeth, M.A.; Jennings, P.D.; El-Din, M.G. Perspectives on environmental impacts and a land reclamation strategy for solar and wind energy systems. Sci. Total Environ. 2020, 718, 134602.
- Bett, A.; Burger, B.; Friedrich, L.; Kost, C.; Nold, S.; Peper, D.; Philipps, S.; Preu, R.; Rentsch, J.; Stryi-Hipp, G.; et al. Photovoltaics Report. February 2022. Available online: https://www.ise.fraunhofer.de/content/dam/ise/de/documents/publications/studies/Photovoltaics-Report.pdf (accessed on 30 March 2022).
- Hernández-Callejo, L.; Gallardo-Saavedra, S.; Alonso-Gómez, V. A review of photovoltaic systems: Design, operation and maintenance. Sol. Energy 2019, 188, 426–440.
- Fouad, M.M.; Shihata, L.A.; Morgan, E.S.I. An integrated review of factors influencing the performance of photovoltaic panels. Renew. Sustain. Energy Rev. 2017, 80, 1499–1511.
- Gupta, A.; Gupta, K.; Saroha, S. Solar irradiation forecasting technologies: A review. Strateg. Plan. Energy Environ. 2020, 39, 319–354.
- Diagne, M.; David, M.; Lauret, P.; Boland, J.; Schmutz, N. Review of solar irradiance forecasting methods and a proposition for small-scale insular grids. Renew. Sustain. Energy Rev. 2013, 27, 65–76.
- Pan, M.; Li, C.; Gao, R.; Huang, Y.; You, H.; Gu, T.; Qin, F. Photovoltaic power forecasting based on a support vector machine with improved ant colony optimization. J. Clean. Prod. 2020, 277, 123948.
- Patarroyo-Montenegro, J.F.; Vasquez-Plaza, J.D.; Rodriguez-Martinez, O.F.; Garcia, Y.V.; Andrade, F. Comparative and cost analysis of a novel predictive power ramp rate control method: A case study in a pv power plant in puerto rico. Appl. Sci. 2021, 11, 5766.
- Mas’ud, A.A. Comparison of three machine learning models for the prediction of hourly PV output power in Saudi Arabia. Ain Shams Eng. J. 2022, 13, 101648.
- Bacher, P.; Madsen, H.; Nielsen, H.A. Online short-term solar power forecasting. Sol. Energy 2009, 83, 1772–1783.
- Chai, S.; Xu, Z.; Jia, Y.; Wong, W.K. A Robust Spatiotemporal Forecasting Framework for Photovoltaic Generation. IEEE Trans. Smart Grid 2020, 11, 5370–5382.
- Gomez, F.; Sa, N.; Schmidhuber, U.; Wierstra, D. Evolino: Hybrid Neuroevolution/Optimal Linear Search for Sequence Learning. 2005. Available online: https://www.researchgate.net/publication/248554235 (accessed on 13 March 2023).
- Chen, B.; Lin, P.; Lai, Y.; Cheng, S.; Chen, Z.; Wu, L. Very-short-term power prediction for PV power plants using a simple and effective RCC-LSTM model based on short term multivariate historical datasets. Electronics 2020, 9, 289.
- DKA Solar Center. Available online: https://www.dkasolarcentre.com.au (accessed on 11 September 2022).
- Yadav, H.; Thakkar, A. NOA-LSTM: An efficient LSTM cell architecture for time series forecasting. Expert Syst. Appl. 2024, 238, 122333.
- An, W.; Wang, L.; Zhang, D. Comprehensive commodity price forecasting framework using text mining methods. J. Forecast. 2023, 42, 1865–1888.
- Khan, Z.A.; Hussain, T.; Haq, I.U.; Ullah, F.U.M.; Baik, S.W. Towards efficient and effective renewable energy prediction via deep learning. Energy Rep. 2022, 8, 10230–10243.
- Bayrak, F.; Ertürk, G.; Oztop, H.F. Effects of partial shading on energy and exergy efficiencies for photovoltaic panels. J. Clean. Prod. 2017, 164, 58–69.
- Singh, R.; Chen, Y. Learning Gaussian Hidden Markov Models from Aggregate Data. IEEE Control Syst. Lett. 2023, 7, 478–483.
- Lee, J.; Cho, W.; Choi, J. Fault detection for IoT hydrogen refueling station system using a combined hidden Markov model mixed with Gaussian. In Proceedings of the International Conference on Electrical, Computer, Communications and Mechatronics Engineering, ICECCME 2021, Mauritius, 7–8 October 2021; Institute of Electrical and Electronics Engineers Inc.: Piscataway, NJ, USA, 2021.
- Yao, T.; Wang, J.; Wu, H.; Zhang, P.; Li, S.; Xu, K.; Liu, X.; Chi, X. Intra-Hour Photovoltaic Generation Forecasting Based on Multi-Source Data and Deep Learning Methods. IEEE Trans. Sustain. Energy 2022, 13, 607–618.
- Yu, Y.; Si, X.; Hu, C.; Zhang, J. A review of recurrent neural networks: Lstm cells and network architectures. Neural Comput. 2019, 31, 1235–1270.
- Abadi, M.; Agarwal, A.; Barham, P.; Brevdo, E.; Chen, Z.; Citro, C.; Corrado, G.S.; Davis, A.; Dean, J.; Devin, M.; et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. 2015. Available online: https://www.tensorflow.org/ (accessed on 25 January 2023).
- Ambient, L. Ambient Weather Network. Available online: https://ambientweather.net/ (accessed on 9 February 2023).
- Manu, J. Modern Time Series Forecasting with Python Master Industry-Ready Time Series Forecasting Using Modern Machine Learning and Deep Learning; Packt Publishing Ltd.: Birmingham, UK, 2022.
- Rivera, A.A.I.; Colucci-Ríos, J.A.; O’Neill-Carrillo, E. Achievable Renewable Energy Targets for Puerto Rico’s Renewable Energy Portfolio Standard. 2009. Available online: https://bibliotecalegalambiental.files.wordpress.com/2013/12/achievable-renewable-energy-targets-fo-p-r.pdf (accessed on 28 January 2023).
Classes | Input Variables |
---|---|
Atmospheric characteristics | Pressure, temperature, cloud abundance, rainfall, cloud formation, cloud cover in the atmosphere, radiation, humidity, density, wind energy, wind speed, wind direction, evaporation, sunshine duration, wind gust, average temperature, ambient temperature, minimum temperature, maximum temperature, sky information, temperature variation. |
Solar characteristics | Solar energy, solar irradiance, zenith angle, global horizontal irradiance, diffuse horizontal irradiance, direct normal irradiance, global solar radiation, daily solar radiation, cell temperature, wavelength, precipitation, photovoltaic energy. |
Geographic conditions | Latitude, longitude, altitude. |
Related Work | Advantages | Disadvantages |
---|---|---|
I-ACO-SVM [7] | SVM is a classical machine learning technique. A workflow was proposed where the radial basis function kernel parameters are fine-tuned using optimal parameters obtained from the ant colony algorithm. This approach yields better accuracies and can be implemented in an embedded system due to its computational efficiency [7]. | The hyper-tuning of the parameters requires initial computational efforts due to the application of a search grid. |
KNN [9] | KNN is a suitable method for conducting load profile forecasting because it is highly adaptable for analyzing the K-nearest neighbors. | The limitations of KNN methods arise from their requirement for a substantial amount of data to accurately perform similarity measurements for identifying the k-nearest neighbors. In contrast, deep learning approaches have demonstrated superior performance in terms of MSE, MAPE, and R2. |
ConvLSTM [11] | The performance achieved by the convolutional LSTM outperforms the accuracy obtained for the baseline algorithms, including classical machine learning techniques. This model integrates spatial image analysis into the study of power prediction [11]. | Given the intricacy of the convolutional LSTM and the need for hyperparameter tuning via an extensive grid search, it demands a robust computational infrastructure. |
Evolino [12] | The Evolino framework typically avoids problems of vanishing gradients related to the RNN [12]. | In the framework that combines Evolino with LSTM, the training process is computationally expensive and may encounter overfitting issues owing to the large number of parameters that must be tuned [12]. |
RCC-LSTM [13] | The framework proposed in [13] outperforms the results of baseline algorithms such as RCC-RBFNN and RCC-BPNN. One of the major contributions is made during the preprocessing stage, where similarity measurements are obtained using the window size. | The RCC-LSTM requires a selection of threshold values, and the adjustment of cell numbers is contingent upon specific weather conditions in accordance with a fixed window size. |
ESN-CNN [17] | The pipeline proposed in [17] comprises three stages. The first stage involves preprocessing, which includes the removal of data abnormalities, followed by data normalization and the initialization of parameters for the echo state network (ESN). The output of the final ESN stage serves as the input for the convolutional neural network (CNN). This pipeline demonstrates excellent performance in power prediction for the benchmark datasets. | Echo state networks are susceptible to overfitting, primarily due to their large number of processing units. Furthermore, the convolutional operations within this framework require matrix multiplication, escalating computational complexity. |
Symbol | Description |
---|---|
| | Magnitude difference between two consecutive points |
| | Variable to represent some feature |
| | Magnitude of feature in the time |
| | Magnitude of feature in the time |
| | Hidden state vector of HMM |
| | Observation variable vector of HMM |
| | Prior probability of HMM |
| | Training data |
| | Position of the measurement |
| | Present measure, used as input in the training process |
| | Future desired measures, used as a reference in the training process |
| | Predicted measure. This is the output of the trained model |
| | Visible state in time of LSTM cell |
| | Hidden state in time of LSTM cell |
| | Present measure used as input to LSTM cell |
| | Forget state of LSTM cell |
| | Input state of LSTM cell |
| | Output state of LSTM cell |
| | Predicted visible state of LSTM cell |
| | Sigmoid function |
| | Dot operator |
and | Configuration parameter of LSTM cell |
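To show how the forget, input, and output gates listed above interact, here is a minimal NumPy sketch of one forward step of a textbook LSTM cell. It follows the standard formulation and is not necessarily the exact variant or notation used in the paper; the names `lstm_step`, `W`, and `b`, the layer sizes, and the random weights are assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One forward step of a standard LSTM cell.

    W[g] has shape (hidden, hidden + inputs) and b[g] has shape (hidden,)
    for each gate g in "f" (forget), "i" (input), "o" (output), "g" (candidate).
    """
    z = np.concatenate([h_prev, x_t])       # previous output joined with the input
    f = sigmoid(W["f"] @ z + b["f"])        # how much of the old cell state to keep
    i = sigmoid(W["i"] @ z + b["i"])        # how much new information to store
    o = sigmoid(W["o"] @ z + b["o"])        # how much of the cell state to expose
    g = np.tanh(W["g"] @ z + b["g"])        # candidate values for the cell state
    c_t = f * c_prev + i * g                # remove old and save new information
    h_t = o * np.tanh(c_t)                  # output passed to the next time step
    return h_t, c_t

# Hypothetical sizes and untrained weights: 2 input features, 3 hidden units.
rng = np.random.default_rng(0)
n_in, n_hidden = 2, 3
W = {k: rng.normal(scale=0.1, size=(n_hidden, n_hidden + n_in)) for k in "fiog"}
b = {k: np.zeros(n_hidden) for k in "fiog"}

h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for x_t in np.array([[0.2, 0.5], [0.3, 0.6], [0.1, 0.4]]):
    h, c = lstm_step(x_t, h, c, W, b)
```

In a trained model the weight matrices and biases are learned, and a final dense layer typically maps the last output to the predicted measure.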
Name | Dataset | Features | Outliers Removed | MSE (kW²) | RMSE (kW) | MAE (kW) |
---|---|---|---|---|---|---|
Experiment 1 | DKA | Only active power | Yes | | | |
Experiment 2 | DKA | 8 features | Yes | | | |
Experiment 3 | DKA | 5 features as SVM [7] | Yes | | | |
Experiment 4 | Ambient Weather | Only solar radiation | Yes | | | |
Experiment 5 | DKA | Only active power | No | | | |
Experiment 6 | DKA subset selected in SVM [7] | 5 features as SVM [7] | Yes | | | |
Experiment 7 | DKA subset selected in SVM [7] | 2 features | Yes | | | |
SVM [7] | DKA | 5 features | Yes | | | |
RCC-LSTM [13] | DKA | - | - | | | |
ESN-CNN [17] | DKA | - | - | | | |
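For reference, the three error metrics reported in the table can be computed as in the following sketch, which uses the standard definitions of MSE, RMSE, and MAE. The measured and predicted values are hypothetical and serve only to make the arithmetic easy to follow.

```python
import math

def mse(y_true, y_pred):
    """Mean square error: average of the squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error: square root of the MSE, in the signal's units."""
    return math.sqrt(mse(y_true, y_pred))

def mae(y_true, y_pred):
    """Mean absolute error: average of the absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical measured and predicted power values in kW.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.0, 4.5]

print(f"MSE  = {mse(measured, predicted):.3f}")   # (0.25 + 0 + 1 + 0.25) / 4 = 0.375
print(f"RMSE = {rmse(measured, predicted):.3f}")  # sqrt(0.375), about 0.612
print(f"MAE  = {mae(measured, predicted):.3f}")   # (0.5 + 0 + 1 + 0.5) / 4 = 0.5
```

Because the MSE squares each error, a single large miss (here the 1 kW error) dominates it, while the MAE weights all errors equally.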
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Delgado, C.J.; Alfaro-Mejía, E.; Manian, V.; O’Neill-Carrillo, E.; Andrade, F. Photovoltaic Power Generation Forecasting with Hidden Markov Model and Long Short-Term Memory in MISO and SISO Configurations. Energies 2024, 17, 668. https://doi.org/10.3390/en17030668