# Convolutional Autoencoder-Based Anomaly Detection for Photovoltaic Power Forecasting of Virtual Power Plants


## Abstract

## 1. Introduction

- We propose a preprocessing method and a forecasting model for PV sites, some of which exhibit anomalous power generation. Unlike conventional PV forecasting, which assumes either normal generation or prior knowledge of the anomalies in behind-the-meter (BTM) settings, our method detects anomalous sites proactively.
- For interpretable anomaly detection, we develop a model that combines a convolutional autoencoder (CAE) with principal component analysis (PCA), using scree plot analysis to extract and analyze the features of solar power data. The model thereby represents the important information in solar power data as low-dimensional feature vectors.
- Our methodology is designed to be robust to real-world data. Using the proposed anomaly detection, we compare two types of VPPs: one composed only of normal sites and one composed of a random mixture of anomalous and normal sites. The comparison shows that a simple and efficient unsupervised method for constructing a VPP from normal PV sites alone yields better forecasting performance than the mixed VPP. The forecasting error of the normal VPP is 6% or less, which satisfies the condition for receiving full incentives in the renewable energy wholesale market run by Korea Power Exchange (KPX).
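
The clustering stage of the anomaly detection described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes the CAE and PCA stages have already reduced each site to a low-dimensional feature vector. The feature values, the site names, the farthest-first initialization, and the rule that the larger cluster is the normal one are all hypothetical.

```python
import math

def farthest_first_init(points, k):
    """Deterministic seeding (an assumption): start from the first point,
    then repeatedly add the point farthest from the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids

def kmeans(points, k, max_iters=100):
    """Plain Lloyd's algorithm on a list of equal-length tuples."""
    centroids = farthest_first_init(points, k)
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids

# Hypothetical low-dimensional features, one vector per PV site.
site_features = {
    "site_a": (1.0, 0.1), "site_b": (1.1, 0.0), "site_c": (0.9, 0.2),
    "site_d": (1.2, 0.1), "site_e": (-2.0, 1.5), "site_f": (-2.2, 1.4),
}
names = list(site_features)
points = list(site_features.values())

labels, _ = kmeans(points, k=2)
# Assumption: the larger cluster corresponds to normal sites.
normal_label = max(set(labels), key=labels.count)
normal_sites = [n for n, l in zip(names, labels) if l == normal_label]
print(normal_sites)  # ['site_a', 'site_b', 'site_c', 'site_d']
```

Only the sites in `normal_sites` would then be aggregated into the normal VPP that is compared against the randomly mixed VPP.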

## 2. Proposed Methodologies

#### 2.1. Dataset

#### 2.2. Anomaly Detection

#### 2.2.1. Convolutional Autoencoder

#### 2.2.2. Principal Component Analysis

#### 2.2.3. K-Means Clustering

#### 2.3. Model Selection

#### 2.3.1. Model Selection of CAE

#### 2.3.2. Model Selection of PCA

#### 2.3.3. Model Selection of K-Means Clustering

#### 2.4. Forecasting Model

#### 2.4.1. Transformer Encoder

#### 2.4.2. LSTM

## 3. An Application of Anomaly Detection for VPP Power Forecasting

#### 3.1. Data Preprocessing

#### 3.1.1. PV Data

#### 3.1.2. Weather Data

#### 3.2. Forecasting Result

## 4. Conclusions

## Author Contributions

## Funding

## Data Availability Statement

## Conflicts of Interest

## Abbreviations

Abbreviation | Definition |
---|---|
DL | Deep learning |
DNN | Deep neural network |
LSTM | Long short-term memory |
PV | Photovoltaic |
PCA | Principal component analysis |
VPP | Virtual power plant |
GHG | Greenhouse gas |
AI | Artificial intelligence |
MLP | Multi-layer perceptron |
ANN | Artificial neural network |
RNN | Recurrent neural network |
STCNN | Space–time convolutional neural network |
GNN | Graph neural network |
GRU | Gated recurrent unit |
ESS | Energy storage system |
CAE | Convolutional autoencoder |
KPX | Korea Power Exchange |
SDSP | Stacked daily solar profile |
PCs | Principal components |
ARMA | Autoregressive moving average |
MLR | Multiple linear regression |
SVM | Support vector machine |


PV Site | NMAE ^{1} |
---|---|
S1 | 10.28% |
S2 | 7.04% |
S3 | 12.77% |
S4 | 6.85% |
S5 | 18.60% |
S6 | 14.03% |
S7 | 9.66% |
S8 | 6.68% |

^{1} Normalized mean absolute error.
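
For readers unfamiliar with the metric, the sketch below shows one common way to compute NMAE. The source only names the metric, so normalizing by the installed capacity is an assumption here, and the function name `nmae` and the sample values are hypothetical.

```python
def nmae(actual, forecast, capacity):
    """Normalized mean absolute error in percent.

    Assumption: the absolute error is normalized by the installed capacity
    (same unit as the power values); the source does not state the normalizer.
    """
    if not actual or len(actual) != len(forecast):
        raise ValueError("actual and forecast must be non-empty and equally long")
    total_abs_error = sum(abs(a - f) for a, f in zip(actual, forecast))
    return 100.0 * total_abs_error / (len(actual) * capacity)

# Hypothetical hourly values (MW) for a site with 10 MW capacity.
actual = [0.0, 5.0, 8.0, 4.0]
forecast = [0.0, 4.0, 9.0, 5.0]
print(nmae(actual, forecast, capacity=10.0))  # 7.5
```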

Layer | Complexity | MSE | RMSE | MAE | Accuracy | Precision | Recall | F1 Score | Silhouette Score |
---|---|---|---|---|---|---|---|---|---|
3, 2, 16 | 30,883 | 0.0274 | 0.1655 | 0.0968 | 99.48% | 99.23% | 99.94% | 99.58% | 0.94 |
3, 2, 32 | 60,467 | 0.0237 | 0.1540 | 0.0875 | 99.44% | 99.17% | 99.94% | 99.55% | 0.90 |
3, 2, 64 | 119,635 | 0.0249 | 0.1577 | 0.0885 | 98.97% | 98.72% | 99.61% | 99.16% | 0.97 |
3, 2, 128 | 237,971 | 0.0260 | 0.1614 | 0.0961 | 99.09% | 99.22% | 99.29% | 99.26% | 0.87 |
3, 2, 256 | 474,643 | 0.0294 | 0.1715 | 0.0992 | 99.17% | 98.91% | 99.74% | 99.32% | 0.92 |
3, 3, 16 | 70,142 | 0.0190 | 0.1377 | 0.0727 | 99.13% | 99.04% | 99.55% | 99.29% | 0.94 |
3, 3, 32 | 136,686 | 0.0152 | 0.1232 | 0.0642 | 99.28% | 99.23% | 99.61% | 99.42% | 0.92 |
3, 3, 64 | 269,774 | 0.0140 | 0.1184 | 0.0607 | 99.44% | 99.36% | 99.74% | 99.55% | 0.92 |
3, 3, 128 | 535,950 | 0.0120 | 0.1093 | 0.0549 | 99.52% | 99.61% | 99.61% | 99.61% | 0.94 |
3, 3, 256 | 1,068,302 | 0.0127 | 0.1125 | 0.0563 | 99.44% | 99.61% | 99.48% | 99.55% | 0.92 |
3, 4, 16 | 126,325 | 0.0159 | 0.1262 | 0.0651 | 99.21% | 99.16% | 99.55% | 99.35% | 0.86 |
3, 4, 32 | 244,613 | 0.0113 | 0.1061 | 0.0524 | 99.17% | 99.10% | 99.55% | 99.32% | 0.89 |
3, 4, 64 | 481,189 | 0.0139 | 0.1181 | 0.0595 | 99.21% | 99.29% | 99.42% | 99.35% | 0.73 |
3, 4, 128 | 954,341 | 0.0082 | 0.0907 | 0.0447 | 99.48% | 99.74% | 99.42% | 99.58% | 0.86 |
3, 4, 256 | 1,900,645 | 0.0074 | 0.0859 | 0.0404 | 99.44% | 99.55% | 99.55% | 99.55% | 0.87 |

MSE, RMSE, and MAE are reconstruction errors; accuracy, precision, recall, F1 score, and silhouette score are anomaly detection scores.
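
Two columns of the table above follow from others by standard definitions: RMSE is the square root of MSE, and the F1 score is the harmonic mean of precision and recall. The short check below applies these textbook formulas to the first row (3, 2, 16). It is a consistency check under those definitions, not a description of how the authors computed the scores.

```python
import math

def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both in the same unit)."""
    return 2.0 * precision * recall / (precision + recall)

# Values copied from the first row of the table (percentages kept as-is).
mse, precision, recall = 0.0274, 99.23, 99.94

rmse = math.sqrt(mse)
f1 = f1_score(precision, recall)

print(round(rmse, 4))  # 0.1655
print(round(f1, 2))    # 99.58
```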

Model | Silhouette Score | Accuracy | Precision | Recall | F1 Score |
---|---|---|---|---|---|
Only K-means | 0.39 | 95.19% | 93.48% | 99.10% | 96.20% |
CAE + K-means | 0.78 | 96.62% | 95.18% | 99.35% | 97.31% |
CAE-PCA + K-means | 0.97 | 98.97% | 98.72% | 99.61% | 99.16% |

Model/K | Silhouette Score |
---|---|
CAE-PCA + K-means/K = 2 | 0.97 |
CAE-PCA + K-means/K = 3 | 0.94 |
CAE-PCA + K-means/K = 4 | 0.66 |
CAE-PCA + K-means/K = 5 | 0.51 |
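
The silhouette score used to compare values of K in the table above can be computed as in the sketch below. It follows the standard definition with Euclidean distance. The sample points and labelings are hypothetical, and scoring singleton clusters as 0 is a common convention assumed here.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance).

    Requires at least two clusters and at least one pair of distinct points.
    """
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    scores = []
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            scores.append(0.0)  # convention (assumed) for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for key, other in clusters.items() if key != l
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Hypothetical 1-D features with two well-separated groups.
points = [(0.0,), (1.0,), (10.0,), (11.0,)]
good = silhouette_score(points, [0, 0, 1, 1])  # clusters match the groups
bad = silhouette_score(points, [0, 1, 0, 1])   # clusters cut across the groups
print(good > bad)  # True
```

Under this criterion, the K with the highest score is preferred, which corresponds to K = 2 in the table.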

Layer | Name | Dimension |
---|---|---|
0 | transformer encoder | 168 |
1 | LSTM decoder | 64 |
2 | FC layer 1 | 256 |
3 | FC layer 2 | 256 |
4 | output FC layer | 24 |

Hyperparameter | Value |
---|---|
Batch size | 16 |
Learning rate | 0.0001 |
Optimizer | Adam |
Epoch | 200 |
Loss function | MSE |

Region A before Anomaly Detection | | | Region A after Anomaly Detection | | |
---|---|---|---|---|---|
VPP ID | Capacity (MW) | NMAE (%) | VPP ID | Capacity (MW) | NMAE (%) |
A1 | 17.17 | 5.6 | A16 | 21.14 | 4.66 |
A2 | 33.50 | 6.17 | A17 | 13.80 | 5.01 |
A3 | 14.80 | 5.91 | A18 | 18.37 | 4.5 |
A4 | 17.10 | 6.01 | A19 | 24.34 | 4.49 |
A5 | 12.60 | 5.96 | A20 | 19.87 | 4.51 |
A6 | 12.99 | 5.91 | A21 | 15.26 | 4.4 |
A7 | 24.51 | 6.46 | A22 | 15.84 | 4.47 |
A8 | 17.70 | 6.19 | A23 | 14.55 | 4.39 |
A9 | 17.95 | 6.16 | A24 | 22.32 | 5.17 |
A10 | 20.46 | 6.48 | A25 | 21.55 | 4.89 |
A11 | 19.95 | 6.35 | A26 | 14.33 | 5.12 |
A12 | 25.62 | 6.06 | A27 | 14.85 | 4.67 |
A13 | 10.11 | 6.38 | A28 | 14.67 | 4.32 |
A14 | 17.92 | 5.6 | A29 | 19.48 | 4.97 |
A15 | 20.84 | 6.5 | A30 | 25.98 | 4.9 |
Average | 18.88 | 6.12 | Average | 18.42 | 4.70 |
Improvement | | | | | 23.2% |
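
The "Improvement" row can be reproduced from the per-VPP NMAE values, assuming it is the relative reduction of the average NMAE, (before − after) / before. That formula is inferred from the reported numbers rather than stated in the source, and the variable names below are illustrative.

```python
from statistics import mean

# NMAE (%) per VPP in Region A, copied from the table above.
nmae_before = [5.6, 6.17, 5.91, 6.01, 5.96, 5.91, 6.46, 6.19,
               6.16, 6.48, 6.35, 6.06, 6.38, 5.6, 6.5]
nmae_after = [4.66, 5.01, 4.5, 4.49, 4.51, 4.4, 4.47, 4.39,
              5.17, 4.89, 5.12, 4.67, 4.32, 4.97, 4.9]

mean_before = mean(nmae_before)
mean_after = mean(nmae_after)
# Assumed definition: relative reduction of the average NMAE, in percent.
improvement = 100.0 * (mean_before - mean_after) / mean_before

print(f"{mean_before:.2f}")  # 6.12
print(f"{mean_after:.2f}")   # 4.70
print(f"{improvement:.1f}")  # 23.2

# Every VPP built from normal sites stays within the 6% error bound.
print(all(v <= 6.0 for v in nmae_after))  # True
```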

Region B before Anomaly Detection | | | Region B after Anomaly Detection | | |
---|---|---|---|---|---|
VPP ID | Capacity (MW) | NMAE (%) | VPP ID | Capacity (MW) | NMAE (%) |
B1 | 27.34 | 6.19 | B8 | 25.94 | 4.81 |
B2 | 15.82 | 6.31 | B9 | 24.22 | 5.39 |
B3 | 29.77 | 6.26 | B10 | 24.10 | 4.66 |
B4 | 22.50 | 6.41 | B11 | 26.29 | 4.87 |
B5 | 21.37 | 6.52 | B12 | 19.69 | 5.02 |
B6 | 23.14 | 6.18 | B13 | 36.16 | 4.82 |
B7 | 25.47 | 6.4 | B14 | 19.52 | 4.96 |
Average | 23.63 | 6.32 | Average | 25.13 | 4.93 |
Improvement | | | | | 22.00% |

Region C before Anomaly Detection | | | Region C after Anomaly Detection | | |
---|---|---|---|---|---|
VPP ID | Capacity (MW) | NMAE (%) | VPP ID | Capacity (MW) | NMAE (%) |
C1 | 17.85 | 5.85 | C8 | 32.96 | 5.11 |
C2 | 16.97 | 6.4 | C9 | 34.52 | 4.8 |
C3 | 25.38 | 6.46 | C10 | 42.34 | 5.1 |
C4 | 25.97 | 6.14 | C11 | 35.62 | 4.85 |
C5 | 20.84 | 6.4 | C12 | 26.00 | 5.26 |
C6 | 39.45 | 6.5 | C13 | 32.62 | 5.1 |
C7 | 30.69 | 6.82 | C14 | 38.70 | 5 |
Average | 25.31 | 6.37 | Average | 34.68 | 5.03 |
Improvement | | | | | 20.98% |

Region D before Anomaly Detection | | | Region D after Anomaly Detection | | |
---|---|---|---|---|---|
VPP ID | Capacity (MW) | NMAE (%) | VPP ID | Capacity (MW) | NMAE (%) |
D1 | 12.16 | 6.23 | D6 | 15.44 | 4.5 |
D2 | 6.90 | 6.28 | D7 | 24.23 | 4.49 |
D3 | 29.26 | 6.34 | D8 | 19.68 | 5.28 |
D4 | 10.21 | 6.7 | D9 | 17.61 | 5.16 |
D5 | 8.78 | 6.45 | D10 | 37.53 | 5.03 |
Average | 13.46 | 6.40 | Average | 22.90 | 4.89 |
Improvement | | | | | 23.56% |

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Park, T.; Song, K.; Jeong, J.; Kim, H.
Convolutional Autoencoder-Based Anomaly Detection for Photovoltaic Power Forecasting of Virtual Power Plants. *Energies* **2023**, *16*, 5293.
https://doi.org/10.3390/en16145293
