Short-Term Photovoltaic Power Plant Output Forecasting Using Sky Images and Deep Learning

Abstract: With the steady increase in the use of renewable energy sources in the energy sector, new challenges arise, especially the unpredictability of these energy sources. This uncertainty complicates the management, planning, and development of energy systems. An effective solution to these challenges is short-term forecasting of the output of photovoltaic power plants. In this paper, a novel method for short-term production prediction was explored which involves continuous photography of the sky above the photovoltaic power plant. By analyzing a series of sky images, patterns can be identified to help predict future photovoltaic power generation. A hybrid model that integrates both a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) for short-term production forecasting was developed and tested. This model effectively detects spatial and temporal patterns from images and power output data, displaying considerable prediction accuracy. In particular, a 74% correlation was found between the model’s predictions and actual future production values, demonstrating the model’s efficiency. The results of this paper suggest that the hybrid CNN-LSTM model offers an improvement in prediction accuracy and practicality compared to traditional forecasting methods. This paper highlights the potential of Deep Learning in improving renewable energy practices, particularly in power prediction, contributing to the overall sustainability of power systems.


Introduction
Photovoltaic (PV) power plants are among the most widely used types of renewable energy power plants [1]. Mainly because of their minimal impact on the environment, recent research has focused on improving solar cells in terms of efficiency, manufacturing cost, and durability. As a result, the presence of PV power plants in the power grid is increasing [2].
The unpredictable nature of power generation from renewable energy sources, including solar energy, results in voltage and frequency fluctuations within the power grid, posing challenges for system management and control [3]. Changes in the availability of renewable energy sources can occur rapidly, leaving conventional power plants with insufficient time to adjust their output. The delayed response of these conventional energy sources disrupts the balance between electrical energy generated and consumer demand. During periods when this balance is not achieved, voltage and frequency may deviate from their nominal values, resulting in a degradation of electrical power quality [4,5].
To mitigate the negative impact of renewable energy sources on the power system, it is critical to predict changes in the availability of these energy sources with reasonable accuracy. The highly dynamic nature of weather conditions makes accurate long-term prediction of cloud cover at a given location difficult [6]. One approach is short-term forecasting of the cloud cover at the observed location (10 to 15 min ahead within a radius of 2000 m). This narrow spatial and temporal focus allows for more accurate prediction of cloud cover [7]. Implementing a reliable power generation forecasting system reduces the need for balancing power, that is, the reserve power needed to compensate for deviations of production by renewable sources from the agreed schedule. Consequently, improved forecasting accuracy reduces the costs associated with integrating renewable energy sources into the power system [8]. Moreover, minimising production constraints due to high variability promotes better utilisation of existing systems, resulting in lower electricity prices for consumers within the power system in addition to the cost benefits of system integration [9].
Energies 2023, 16, 5428
Over the years, numerous methods have been developed to predict the production of photovoltaic (PV) power plants [10]. These methods are tailored to encompass a wide range of spatial and temporal resolutions. They can be categorised based on the type of input data, data preprocessing techniques, frequency of input data collection, spatial resolution, and temporal and spatial scope [11].
The prediction of PV power plant production depends heavily on the analysis of local weather conditions. By examining the correlations between the output power and certain recorded variables, a trend can be identified that is helpful in forecasting future production. Direct and diffuse solar radiation play a significant role in the electricity generation of PV systems. However, the correlation between PV production and other factors such as wind speed, temperature, time of day, and relative humidity is considerably less pronounced due to the variability of wind direction and speed at different altitudes and relatively stable temperatures within a short time frame [12].
Short-term forecasting is crucial because PV systems often exhibit significant variations in output power over brief periods of time [13]. For example, before a plant is overshadowed by cloud cover, there is an increase in output due to the cloud edge effect (Figure 1). Under such conditions, the output power of a PV system can increase by up to 150% of its last stable value due to the combined effect of direct solar radiation and scattered radiation from nearby clouds [14]. Accurate prediction of these phenomena allows the system to prepare in time for upcoming production fluctuations.
In order to identify trends in cloud cover change, input data is required that includes the current and past cloud cover status over the observed PV system. Elements such as cloud cover, position, direction, and speed of cloud movement can be obtained by analysing satellite and radar images of cloud cover [15]. However, satellite imagery databases are poorly suited for short-term forecasting due to insufficient temporal and spatial resolution. For local, short-term forecasts, a database with high spatial and temporal resolution information on cloud cover over the facility is essential. A viable method for creating such a database is serial photography of the sky hemisphere [16].
Evaluation of the effects of individual variables such as temperature, dew point temperature, relative humidity, visibility, barometric pressure, wind speed, cloud cover, wind direction, and precipitation level at the facility site on output power reveals a low correlation between the individual variables and output power. However, a more significant correlation can be obtained by analysing the combined effects of multiple variables [17]. Using a single input variable, namely an image of the sky, to estimate current and future output greatly simplifies the system, increasing its robustness, ease of implementation, and economic efficiency. While using multiple input variables in conjunction with the image could potentially result in higher correlation, this approach would also complicate data collection and processing, as well as the overall size of the system for short-term PV system production prediction.
In addition, there are several methods for processing images of the sky's hemisphere. Information about the motion vectors of clouds derived from a sequence of images can increase the accuracy of the PV production prediction system [18]. Identifying the position of the cloud in a sequence of images and comparing its position in two consecutive images allows the cloud motion vectors to be determined (Figure 2). Taking into account the size of the cloud and its direction and speed of movement, it is possible to predict the possibility of future shading of the observed plant by the cloud and, consequently, the future PV production at this location [19].
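The idea of deriving a motion vector from the cloud position in two consecutive images can be illustrated with a minimal sketch. It is not the method of the cited works: it assumes that each frame has already been segmented into a binary cloud mask, tracks only the centroid of the cloud pixels, and extrapolates the motion linearly. The function names, the sun position, and the coverage radius are hypothetical and chosen for illustration only.

```python
from typing import List, Optional, Tuple

Mask = List[List[int]]  # assumption: 1 = cloud pixel, 0 = clear sky


def cloud_centroid(mask: Mask) -> Tuple[float, float]:
    """Return the (row, col) centroid of all cloud pixels in a binary mask."""
    row_sum = col_sum = count = 0
    for r, line in enumerate(mask):
        for c, value in enumerate(line):
            if value:
                row_sum += r
                col_sum += c
                count += 1
    if count == 0:
        raise ValueError("mask contains no cloud pixels")
    return row_sum / count, col_sum / count


def motion_vector(prev_mask: Mask, curr_mask: Mask) -> Tuple[float, float]:
    """Centroid displacement between two consecutive frames (pixels per frame)."""
    r0, c0 = cloud_centroid(prev_mask)
    r1, c1 = cloud_centroid(curr_mask)
    return r1 - r0, c1 - c0


def frames_until_cover(
    curr_mask: Mask,
    velocity: Tuple[float, float],
    sun_pos: Tuple[float, float],
    radius: float = 1.0,
    max_frames: int = 60,
) -> Optional[int]:
    """Extrapolate the centroid linearly and return the first frame count at
    which it lies within `radius` pixels of the sun position, or None."""
    r, c = cloud_centroid(curr_mask)
    dr, dc = velocity
    for k in range(1, max_frames + 1):
        rr, cc = r + k * dr, c + k * dc
        if (rr - sun_pos[0]) ** 2 + (cc - sun_pos[1]) ** 2 <= radius ** 2:
            return k
    return None
```

A real system would need to handle several clouds, changes in cloud shape, and lens distortion of the hemispherical image, none of which this sketch attempts.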
Deep learning, a branch of machine learning, harnesses the power of multilayer artificial neural networks and enables computers to learn and make decisions autonomously. This strategy enables models to decipher complex patterns and features from large amounts of data to perform a variety of tasks such as image recognition, speech analysis, language translation, and more [20].
For short-term forecasting of PV power plant production, a convolutional neural network (CNN) is particularly advantageous because it can decode nonlinear relationships between the inputs and outputs of a model. Thanks to the network's ability to recognise patterns in photographs, these images can serve as input data [21]. Therefore, a CNN can detect the correlation between sky images and PV system power output. By using a Long Short-Term Memory (LSTM) network to learn from time series data (data that requires long-term memory to determine the trend in PV power plant output changes), a model can be formulated for short-term prediction of PV power plant output [22]. By predicting future levels of cloud cover based on trends in cloud movement and changes in lighting conditions, this model can accurately predict future PV power plant production.
For successful training of the neural network, a comprehensive, high-quality database is critical. A review of the available databases [23] shows a lack of comprehensive databases of high-resolution sky images with solar irradiance data. The low temporal resolution of the collected photos and solar irradiance data is not sufficient to capture the significant changes in output power induced by cloud shading. Figure 3 shows that when data on the production of a photovoltaic power plant is collected with a lower time resolution, a significant amount of information regarding changes in cloud cover and associated changes in output power is lost.
A database containing actual PV power plant output data should have a relatively high temporal resolution to capture the changes in PV power output. Given the highly dynamic nature of weather conditions at power plant sites due to high-velocity clouds with relatively small volumes, databases with high temporal resolution are needed.
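The loss of information at lower temporal resolution can be made concrete with a small numerical sketch. The power values below are invented for illustration (a clear-sky output of 20 kW with a three-minute cloud passage) and are not taken from the database used in this paper; the one-minute and fifteen-minute intervals are likewise assumptions.

```python
from statistics import mean
from typing import List


def downsample_mean(series: List[float], factor: int) -> List[float]:
    """Average consecutive, non-overlapping blocks of `factor` samples."""
    return [
        mean(series[i:i + factor])
        for i in range(0, len(series) - factor + 1, factor)
    ]


def max_ramp(series: List[float]) -> float:
    """Largest absolute change between two consecutive samples."""
    return max(abs(b - a) for a, b in zip(series, series[1:]))


# Hypothetical 30 one-minute samples in kW: a short cloud passage at minute 12.
power_1min = [20.0] * 12 + [6.0] * 3 + [20.0] * 15
power_15min = downsample_mean(power_1min, 15)

print("largest 1-min ramp :", max_ramp(power_1min))    # a 14 kW drop is visible
print("15-min averages    :", power_15min)             # the dip is averaged away
print("largest 15-min ramp:", max_ramp(power_15min))   # only about 2.8 kW remains
```

In this toy series the 14 kW drop caused by the cloud is reduced to a change of roughly 2.8 kW between the two 15-min averages, so a model trained on the coarse series would never see the event it is supposed to predict.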
In summary, the use of a convolutional neural network in conjunction with an LSTM is most promising for the short-term prediction of PV power plant production. The accuracy of a model built with a neural network depends on the type of input data used, which emphasises the need for a comprehensive and detailed database.
The main contributions of this paper are as follows: (1) the development of a CNN-LSTM neural network that uses low-resolution images of the sky and information about the output of photovoltaic power plants as inputs and is designed to forecast the power plant output 15 min in advance; (2) an analysis of the impact of the prediction horizon on the accuracy of the proposed model, under the hypothesis that increasing the forecast horizon may increase the forecast error due to the inherent unpredictability of atmospheric conditions and the sporadic nature of cloud formation and dissipation; (3) an investigation of the effect of input data length on forecast accuracy and model training time, under the hypothesis that increasing the length of the input photo sequence could improve the accuracy of the model. However, a longer input sequence also proportionally increases the training time, thereby introducing a trade-off that must be considered in model building.
The remainder of this paper is organised as follows: Section 1 provides an introduction to the topic and reviews previous approaches to the problem. Section 2 describes the data used to train the neural network. Section 3 describes the CNN-LSTM hybrid method used in this paper. Section 4 presents the results of the network as well as analyses examining the impact of the forecast horizon and the length of the input photo sequence. Section 5 discusses the results. Finally, Section 6 offers a conclusion and suggests possible improvements for future iterations of the method.

Training Data
The database used to train the model for short-term prediction of photovoltaic power plant production was obtained from the website of Stanford University, California [25]. A comprehensive review of available databases revealed that this particular database is well suited for building a short-term prediction model due to its high-frequency sampling of hemispherical sky images and associated photovoltaic power plant output. In addition, the multi-year span of the data collection meets the essential criteria for neural network training, allowing the model to detect seasonal trends in photovoltaic power plant output.
To simplify the short-term forecasting system for photovoltaic power plants, the data used for model training is restricted to the absolute minimum of input parameters. This reduction in input parameters increases the robustness of the system and facilitates its implementation. For the system to be effectively implemented in a real photovoltaic power plant, only a simple data collection module is required, as opposed to a small meteorological station, which is the requirement of several other systems and is responsible for collecting weather-related data at the observation site. The proposed model uses a photograph of the hemisphere of the sky above the observed power plant, the current time (expressed in date, hour, minute, and second), and the prevailing output power of the observed photovoltaic power plant as input data.
A study of the available databases revealed that most are ill-suited for short-term forecasting due to their insufficient sampling frequency. Even the databases with satisfactory sampling frequencies are inadequate due to insufficient amounts of data.
A database that includes several years of information and allows the model to capture seasonal and daily patterns of electricity production is used to train the model [25].

Data Preprocessing
Data preprocessing is a critical step for successful model training. The use of high-resolution images requires a significant amount of storage and computational power for data processing, necessitating a reduction in the size of the training photos.
The large database used to train the network, consisting of 100,250 photos, requires additional preprocessing of the images to reduce memory requirements. The first step in data preprocessing involves reducing image resolution to 64 × 64 pixels. This reduction lowers the storage requirements for the images and allows faster model training with a smaller memory footprint. In addition, the colours in the images are condensed to a grayscale spectrum. This colour reduction further reduces the memory required for image storage and streamlines the analysis of visual features used by the model for prediction. Preprocessing of a randomly selected photograph can be seen in Figure 4. Although reducing the image resolution to 64 × 64 pixels and transitioning to a grayscale colour spectrum appears to result in considerable information loss, it does not noticeably [...]
Prior to model training, the data undergoes preprocessing to make it suitable for learning. This phase includes normalising the data, eliminating missing values, and segmenting the data into training, validation, and test sets. Normalisation of the data improves the learning efficiency of the model in terms of feature detection, while data segmentation facilitates the evaluation of the model performance and generalisation ability across different data sets. In addition, the time series data from the database must be converted into data sequences that are suitable for analysis by an LSTM network. The LSTM network uses a data sequence, e.g., data from a 45-min time window preceding the observed moment, to predict the production of the photovoltaic power plant 15 min ahead.
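The normalisation and sequence-building steps described above can be sketched as follows. This is an illustrative outline rather than the preprocessing code used in this paper: the function names are hypothetical, min-max scaling is only one possible normalisation, and the window and horizon are expressed in samples, so the values 45 and 15 correspond to 45 min and 15 min only under the assumption of one sample per minute.

```python
from typing import Any, List, Sequence, Tuple


def min_max_normalise(values: Sequence[float]) -> List[float]:
    """Scale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def make_sequences(
    frames: Sequence[Any],
    power: Sequence[float],
    window: int,
    horizon: int,
) -> List[Tuple[Sequence[Any], Sequence[float], float]]:
    """Pair every run of `window` consecutive samples with the power value
    measured `horizon` samples after the last sample of that run."""
    if len(frames) != len(power):
        raise ValueError("frames and power must have the same length")
    samples = []
    last_start = len(power) - window - horizon
    for start in range(last_start + 1):
        end = start + window                      # exclusive end of the window
        target = power[end - 1 + horizon]         # value `horizon` steps ahead
        samples.append((frames[start:end], power[start:end], target))
    return samples


# Usage with placeholder data: 70 samples, 45-sample window, 15-sample horizon.
frames = list(range(70))                  # stand-ins for 64 x 64 grayscale images
power = [float(i) for i in range(70)]     # stand-ins for measured output power
samples = make_sequences(frames, power, window=45, horizon=15)
```

When the data is split into training, validation, and test sets, it is common practice to compute the normalisation limits on the training set only and to split chronologically, so that no information from the evaluation period leaks into training.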
The challenge of memory availability is exacerbated when considering that a new dataset, consisting of 32 individual datasets, is generated for each training epoch. Each of these datasets contains photos covering a 45-min period and the corresponding power plant output power information. Preprocessing the data by reducing image resolution and colour spectrum allows for more convenient and efficient model training, even with limited memory resources.
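The memory figures involved can be estimated with simple arithmetic. The sketch below uses the database size and image resolution stated in the text; the number of frames per sequence (one photo per minute) and the interpretation of the 32 datasets as one batch are assumptions made only for this estimate.

```python
def image_bytes(height: int, width: int, channels: int,
                bytes_per_value: int = 1) -> int:
    """Raw size in bytes of one uncompressed image."""
    return height * width * channels * bytes_per_value


N_PHOTOS = 100_250          # database size stated in the text
FRAMES_PER_SEQUENCE = 45    # assumption: one photo per minute over 45 min
SEQUENCES_PER_BATCH = 32    # assumption: the 32 datasets form one batch

gray_64 = image_bytes(64, 64, 1)        # 8-bit grayscale frame
rgb_64 = image_bytes(64, 64, 3)         # same frame with three colour channels

archive_uint8 = N_PHOTOS * gray_64
archive_float32 = N_PHOTOS * image_bytes(64, 64, 1, bytes_per_value=4)
batch_float32 = (SEQUENCES_PER_BATCH * FRAMES_PER_SEQUENCE
                 * image_bytes(64, 64, 1, bytes_per_value=4))

print(f"all photos, 8-bit grayscale : {archive_uint8 / 1e6:.1f} MB")    # 410.6 MB
print(f"all photos, 32-bit floats   : {archive_float32 / 1e6:.1f} MB")  # 1642.5 MB
print(f"one batch, 32-bit floats    : {batch_float32 / 1e6:.1f} MB")    # 23.6 MB
print(f"saving from grayscale alone : factor {rgb_64 // gray_64}")      # factor 3
```

Under these assumptions the complete set of reduced images fits in roughly 0.4 GB as 8-bit values, which illustrates why the reduction to 64 × 64 grayscale frames makes training feasible on modest hardware.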

CNN-LSTM Model
This section focuses on the development and implementation of a hybrid model that incorporates Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM). This combined approach aims at predicting future outputs of a photovoltaic power plant. The choice of a CNN-LSTM network is based on its special capabilities that allow the effective extraction of spatial patterns from images and temporal features from data sequences.
The CNN equips the model with computer vision capabilities. This enables the processing of visual data from sky hemisphere photographs, allowing identification and learning of key features within the images such as cloud formations, illumination variations, and overall shapes. Conversely, LSTM networks are specifically designed to process time series data and can identify and learn both long-term and short-term trends in the data. This includes daily and seasonal patterns in weather, and time series weather data that are critical to accurately predicting energy production.
By integrating these two types of networks, the CNN-LSTM model becomes a powerful tool for forecasting future photovoltaic power plant production. Its ability to simultaneously process spatial information from images and temporal data from data series leads to improved accuracy and robustness of predictions.

Convolutional Neural Networks
CNNs represent a class of Deep Learning models specifically tailored to process visual data such as images and videos [26]. In the field of solar forecasting for photovoltaic systems, CNNs are used to process weather data in the form of images, including satellite imagery and radar data [27].
A typical CNN model is composed of several layers, each performing different operations. These include convolutional layers that are responsible for local feature detection, activation layers that create nonlinearity, pooling layers that are used to downsize the dimensionality of the data, and fully connected layers that perform classification or regression based on the extracted features [28]. In training CNNs, the backpropagation technique and optimization via gradient descent are used with the goal of minimising the prediction error [29]. The architecture of a standard CNN is illustrated in Figure 5.
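The three core operations named above (convolution, activation, and pooling) can be illustrated on a tiny image without any Deep Learning library. The sketch below is a didactic example, not the network used in this paper: the kernel is hand-picked rather than learned, and the convolution is implemented as the sliding-window cross-correlation that Deep Learning frameworks conventionally call convolution.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide the kernel over the image without padding ("valid" mode)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(fmap: Matrix) -> Matrix:
    """Activation layer: replace negative responses by zero."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool(fmap: Matrix, size: int = 2) -> Matrix:
    """Pooling layer: keep the maximum of each non-overlapping size x size block."""
    return [
        [
            max(fmap[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(fmap[0]) - size + 1, size)
        ]
        for r in range(0, len(fmap) - size + 1, size)
    ]


# A 5 x 5 toy "sky" with a dark region on the left and a bright region on the right.
image = [[0.0, 0.0, 0.0, 1.0, 1.0] for _ in range(5)]
# Hand-picked kernel that responds to dark-to-bright transitions from left to right.
edge_kernel = [[-1.0, 1.0], [-1.0, 1.0]]

features = max_pool(relu(conv2d_valid(image, edge_kernel)))
print(features)  # [[0.0, 2.0], [0.0, 2.0]]
```

The pooled feature map is non-zero only where the brightness edge lies, which is the kind of local feature (for example, a cloud boundary) that a trained convolutional layer is expected to pick up from sky images.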

Long Short-Term Memory Networks
LSTM networks are a specialised type of recurrent neural network (RNN) designed explicitly to process sequential data and learn long-term dependencies [30]. In the context of photovoltaic power plant production prediction, LSTMs are used to study meteorological time series data such as temperature, wind speed, and solar irradiance to learn the complicated dependencies that affect electricity production.
An LSTM model consists of memory cells, each containing three distinct types of gates: input gates that control the inflow of information into the memory cell; forget gates that determine which information to retain or discard; and output gates that control the outflow of information from the memory cell [30]. The architecture of the LSTM network is depicted in Figure 6. With these gates, the LSTM is able to efficiently learn and retain long-term dependencies in sequential data. Thus, it provides a solution to the vanishing gradient problem that plagues conventional RNNs [31].
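The role of the three gates can be shown with a single time step of a standard LSTM cell. The sketch below uses one scalar input and one scalar hidden unit to keep the arithmetic readable; it follows the commonly used LSTM formulation rather than any implementation detail of this paper, and the parameter layout (a dictionary with keys such as "w_i" and "b_f") is a hypothetical choice made for this example.

```python
import math
from typing import Dict, Tuple


def sigmoid(x: float) -> float:
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x: float, h_prev: float, c_prev: float,
              p: Dict[str, float]) -> Tuple[float, float]:
    """One time step of a scalar LSTM cell; returns (h_new, c_new)."""
    i = sigmoid(p["w_i"] * x + p["u_i"] * h_prev + p["b_i"])    # input gate
    f = sigmoid(p["w_f"] * x + p["u_f"] * h_prev + p["b_f"])    # forget gate
    o = sigmoid(p["w_o"] * x + p["u_o"] * h_prev + p["b_o"])    # output gate
    g = math.tanh(p["w_g"] * x + p["u_g"] * h_prev + p["b_g"])  # candidate value
    c_new = f * c_prev + i * g          # keep part of the old memory, add new
    h_new = o * math.tanh(c_new)        # expose part of the memory as output
    return h_new, c_new


def zero_params() -> Dict[str, float]:
    """All weights and biases set to zero (illustrative starting point)."""
    return {f"{kind}_{gate}": 0.0 for kind in "wub" for gate in "ifog"}


# With a strongly open forget gate and a closed input gate, the cell state
# is carried over almost unchanged, regardless of the current input.
params = zero_params()
params["b_f"] = 100.0
params["b_i"] = -100.0
h, c = lstm_step(x=0.7, h_prev=0.0, c_prev=2.0, p=params)
```

Because the cell state is updated additively and scaled only by the forget gate, information (and its gradient) can persist over many time steps, which is the mechanism behind the long-term memory described above.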
Training of LSTMs includes backpropagation through time and gradient descent optimization methods that aim to minimise forecast error [32]. This methodology allows the model to comprehend complex temporal patterns and data interdependencies, which is essential for accurately predicting the power output of photovoltaic power plants.

Hybrid Model
This paper presents a hybrid model consisting of a CNN and an LSTM designed to forecast the power output of a photovoltaic power plant. The model processes a sequence of hemispheric sky images and the corresponding power production data of the photovoltaic power plant from the previous 45 min. Its goal is to predict the power production 15 min after the last data point.
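The pairing of a 45 min history with a target 15 min after the last data point can be sketched as a sliding window over one day of data. This is an illustrative sketch only: it assumes one photo and one power reading per minute, and the function name `make_samples` and its arguments are hypothetical rather than taken from this work.

```python
def make_samples(frames, power, n_input=45, horizon=15):
    """Build (image history, power history, target) triples.

    `frames` and `power` are equal-length sequences sampled once per
    minute (an assumption of this sketch). The target is the power
    value `horizon` minutes after the last point of each history.
    """
    if len(frames) != len(power):
        raise ValueError("frames and power must have the same length")

    samples = []
    last_start = len(power) - n_input - horizon
    for start in range(last_start + 1):
        end = start + n_input            # exclusive end of the history window
        target_idx = end - 1 + horizon   # `horizon` minutes after the last input
        samples.append((frames[start:end], power[start:end], power[target_idx]))
    return samples


# Hypothetical data: 100 minutes of placeholder frames and power readings.
frames = [f"frame_{t}" for t in range(100)]
power = list(range(100))
samples = make_samples(frames, power)
```

A series shorter than `n_input + horizon` minutes yields no samples, since no complete history and target pair fits into it.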

The hybrid model consists of three main components: a CNN model for image processing, an LSTM model for processing time-series data, and an additional input for power levels. The structure of the model can be seen in Figure 7.
The CNN component for image processing was developed to identify features within a photo sequence that are related to power output. It consists of the following layers:

• A convolutional layer using 32 filters with a kernel size of (3,3) and ReLU as the activation function;
• A batch normalisation layer, incorporated to enhance the stability and overall performance of the model;
• A MaxPooling layer with a pool size of (2,2), designed to reduce the dimensionality of the data and highlight the most significant features;
• A dropout layer with a dropout rate of 0.3, intended to prevent overfitting of the model;
• A Flatten layer to convert the multidimensional output to a one-dimensional format.
In the proposed approach, the LSTM model is used to account for the temporal dependencies in the input data. The model consists of the following layers:

• A TimeDistributed layer that applies the previously created CNN model to each time step of the input sequence, so that every photo is converted into a feature vector before the sequence reaches the LSTM layer;
• An LSTM layer with a number of units determined by the variable 'lstm_units' (in this model set to 64), using 'tanh' for activation and 'sigmoid' for recurrent activation, allowing modelling of long-term dependencies;
• A dropout layer with a dropout rate of 0.3 to prevent overfitting.
In addition, a power input segment allows the model to account for the current power production of the photovoltaic system. It includes an input layer with a shape of (n_input), where n_input is the number of input data points, in this model set to 45.
The initial approach in this work was to use LSTM networks to process the power input data as well, given the time series characteristics of the data. However, the performance of the model degraded noticeably during the test phase. One hypothesis is that the temporal dependencies within the power input data are not as complex as those of the images. Alternatively, the patterns in the data might be better interpreted by simpler Dense layers that do not contain a temporal component. Therefore, in order to improve the performance of the model, the use of LSTM for processing power input data was ultimately discarded.
After the CNN and LSTM models are combined, a concatenate layer merges the output of the LSTM with the additional power input. A final Dense layer with linear activation then generates the power production forecast, which is the model's prediction target.
The combined CNN and LSTM model was developed in Python using the deep learning libraries TensorFlow and Keras, which allow for ease of implementation and flexibility. For computational infrastructure, Google Colab was selected, an online platform that enables the execution of Jupyter notebooks on Google's remote servers. A key benefit of the Google Colab platform is the provision of powerful graphics processing units (GPUs) that significantly speed up the training of the models. In particular, the NVIDIA Tesla V100 GPU was used, which is known for its strong performance in deep neural network training.
This setup kept training times short: the complete training of the model took only 364 s. Such fast training enables rapid iterations and experiments with different model configurations, which is essential for refining and optimising effective and accurate deep learning models.
The model training employs the 'adam' optimizer with a mean squared error (MSE) loss function, and the mean absolute error (MAE) is used as the performance metric. In total, the combined CNN and LSTM model has 7,889,582 trainable parameters, as shown in Figure 8. The number of parameters reflects the complexity of the model and its potential to learn complex patterns from the input data. This model, constructed by integrating different layers and techniques, proves effective in forecasting the power production of a photovoltaic power plant based on sequences of hemispherical sky images and associated power output data.
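As a plausibility check on the parameter count, the weight-bearing layers can be tallied by hand. The sketch below assumes 64 × 64 grayscale input images (as in Figure 4), an unpadded 3 × 3 convolution, and a single-output Dense layer fed by the 64 LSTM units concatenated with the 45 power values. These settings and the helper names are assumptions of the sketch, not details confirmed by the text.

```python
def conv2d_params(filters, kernel, in_channels):
    """Weights plus one bias per filter."""
    kh, kw = kernel
    return filters * (kh * kw * in_channels + 1)


def lstm_params(units, input_dim):
    """Four weight sets (three gates plus the candidate), each with
    input weights, recurrent weights, and a bias."""
    return 4 * (units * (input_dim + units) + units)


def dense_params(in_dim, out_dim):
    """Weights plus one bias per output."""
    return (in_dim + 1) * out_dim


side = 64                        # assumed image side length in pixels
conv_out = side - 3 + 1          # 62: unpadded 3x3 convolution
pooled = conv_out // 2           # 31: 2x2 max pooling
features = pooled * pooled * 32  # 30752 values per photo after Flatten

total = (
    conv2d_params(32, (3, 3), 1)
    + lstm_params(64, features)
    + dense_params(64 + 45, 1)
)
print(total)  # 7889582
```

Under these assumptions the tally agrees with the 7,889,582 parameters quoted above, and almost all of them sit in the LSTM layer because each flattened photo contributes 30,752 inputs. The scale and shift parameters of the batch normalisation layer are left out of this tally, so the agreement should be read as a consistency check rather than a confirmation of the exact layer settings.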

Results
After the development and training of the CNN-LSTM model, an evaluation was carried out using a test data set. This section presents the experimental results and the performance of the model in forecasting the output of a photovoltaic power plant.

Model Performance Evaluation Metrics
For a thorough evaluation of the Deep Learning model's capabilities in short-term forecasting of photovoltaic power output, a number of evaluation metrics are used: Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and the coefficient of determination (R²).
Mean absolute error (MAE) is used to measure the predictive accuracy of the model. It is calculated as the average absolute difference between the actual and forecasted values [34]. MAE provides a simple and direct indication of the accuracy of the model and its responsiveness to individual errors:

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|P_{\mathrm{forecasted},i} - P_{\mathrm{measured},i}\right|$$

where $P_{\mathrm{forecasted}}$ is the predicted value of a forecast model, $P_{\mathrm{measured}}$ is the target value, and $N$ is the number of testing samples.
Root Mean Squared Error (RMSE) is used to quantify the average magnitude of prediction errors [35]. Unlike MAE, the RMSE attaches more importance to larger errors by squaring the residuals, making it particularly valuable when highlighting larger errors:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(P_{\mathrm{forecasted},i} - P_{\mathrm{measured},i}\right)^{2}}$$

The coefficient of determination (R²) is used to assess how well the model accounts for variability within the target data set [36]. R² quantifies what proportion of the variance in actual power plant production can be predicted by the model:

$$R^{2} = 1 - \frac{\sum_{i=1}^{N}\left(P_{\mathrm{measured},i} - P_{\mathrm{forecasted},i}\right)^{2}}{\sum_{i=1}^{N}\left(P_{\mathrm{measured},i} - \bar{P}_{\mathrm{measured}}\right)^{2}}$$

where $\bar{P}_{\mathrm{measured}}$ is the mean of the measured values over the testing samples.
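The three metrics can be computed directly from paired lists of measured and forecasted values. The following is a minimal sketch using the standard definitions; the function names and the sample values are illustrative and are not results from this work.

```python
import math


def mae(measured, forecasted):
    """Mean absolute error."""
    n = len(measured)
    return sum(abs(f - m) for m, f in zip(measured, forecasted)) / n


def rmse(measured, forecasted):
    """Root mean squared error; squaring weights large errors more heavily."""
    n = len(measured)
    return math.sqrt(sum((f - m) ** 2 for m, f in zip(measured, forecasted)) / n)


def r2(measured, forecasted):
    """Coefficient of determination."""
    mean_m = sum(measured) / len(measured)
    ss_res = sum((m - f) ** 2 for m, f in zip(measured, forecasted))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


# Hypothetical values, for illustration only.
measured = [1.0, 2.0, 3.0, 4.0]
forecasted = [1.5, 2.0, 2.5, 4.0]
print(mae(measured, forecasted), rmse(measured, forecasted), r2(measured, forecasted))
```

For these sample values the MAE is 0.25 and R² is 0.9, while the RMSE (about 0.354) is larger than the MAE because the two non-zero errors dominate the squared sum.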
... variables and further demonstrates the model's ability to accurately forecast photovoltaic power output.

Impact of the Temporal Resolution of Input Data on the Model Accuracy
The temporal resolution of the input data is a critical factor affecting the accuracy of any predictive model. It is especially important when deciding how frequently to photograph the sky hemisphere at the power plant site. Temporal resolution affects not only the load on the processing unit but also the amount of memory required to store such a comprehensive database. In this paper, a focused analysis was conducted to evaluate the impact of different temporal resolutions of input data on the accuracy of the model. The model was specifically tested using input data with temporal resolutions of 1 min, 5 min, and 15 min.
While the temporal resolution of the input data was varied, all other parameters within the model were kept constant. This approach ensured that any variations in model accuracy could be directly attributed to changes in temporal resolution. Reducing the temporal resolution of the input data reduces the amount of training data, resulting in a substantially shorter training time compared to the original model. The training times for the network and other performance evaluation metrics are shown in Figure 11.

These results highlight the significant role that the temporal resolution of the input data plays in the performance of predictive models and provide valuable insights for optimizing the practical applications of the model.
Figure 12 shows the prediction results for a particular day from the test data set. The figure illustrates how data resolution affects the capture of rapid changes in photovoltaic power plant output during primarily cloudy days, which could account for the lower model accuracy at coarser resolutions.
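The effect of coarser temporal resolution on the amount of available data can be sketched by subsampling a 1 min series. This is an illustration only: it assumes that coarser resolutions are obtained by keeping every n-th sample, and the function name `downsample` and the 12 h day length are hypothetical.

```python
def downsample(samples, step):
    """Keep every `step`-th element of a series recorded at 1 min resolution."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    return samples[::step]


one_day = list(range(720))  # hypothetical 12 h of 1 min samples
for step in (1, 5, 15):
    print(f"{step:>2} min resolution: {len(downsample(one_day, step))} samples")
```

Going from 1 min to 15 min resolution leaves 48 of the 720 samples in this example, which shortens training but also removes the points between samples where rapid cloud-driven changes would show up.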

Assessment of Forecasting Horizons in Short-Term Solar Forecasting
The prediction horizon of a model is closely related to its practical application. This paper focuses primarily on short-term forecasting, with an emphasis on a forecast horizon of 15 min into the future. A thorough analysis was undertaken to assess the performance of the model for different forecast horizons and to determine whether the model can be adapted for other tasks, such as long-term or ultra-short-term forecasting.
In testing the various forecast horizons, all model parameters were held fixed except for the forecast horizon, expressed in minutes. The models developed for this purpose correspond to forecast horizons of 1, 5, 10, 15, 20, 25, 30, and 35 min ahead.
The time required to train the model varies slightly with the forecast horizon. On average, training the models with the different forecast horizons took 364 s, or 6 min and 4 s.
Figure 13 illustrates the results of the model analysis with the training dataset and shows how the accuracy of the model in the short-term prediction of photovoltaic power plant output varies with the forecast horizon. These findings expand our understanding of how the forecast horizon affects the performance of the model, which is a key factor in its efficient deployment.
Aside from the embedded power generation patterns, the model uses sequences of photographs taken just before a new forecast is generated. The model uses these photographs to detect changes in cloud cover immediately before the forecast. By merging these observed changes with existing production patterns, the model can predict future power production.
To quantify the effects of varying lengths of the input photo sequences required for prediction, the models were trained sequentially using different input sequence lengths. All other model parameters remained constant to minimise any effect these may have on the results. The parameters shared by all models examined in this analysis include the number of epochs (5), the batch size (32), and the optimizer (adam).
In the analysis described in this section, the length of the input data was changed, using time intervals of 15, 30, 45, 60, 75, 90, 105, and 120 min. These values represent both the number of input photos and the corresponding time frame for the input data, as shown in Figure 14.
A longer input photo sequence can increase the accuracy of the model, because it gives the model a more complete picture of the cloud conditions prior to the forecasted production time. However, a longer sequence also introduces certain disadvantages. For example, if the required photo sequence length for forecasting is set to two hours, it is impossible to forecast the first two hours of production in a day using this method, as illustrated in Figure 15. Therefore, the number of photos used by the model as input data must be carefully balanced to maintain the practicality of the model without compromising the desired accuracy of the prediction.
Furthermore, as the number of input photos increases, the computational resource requirements increase, affecting not only the training phase of the model but also the prediction. The training time of the model can increase significantly when a larger number of photos form the input sequence, and the time required for predicting future production can also grow. On the computer on which the model was trained, the computation time for a single prediction might be negligible. However, if a less powerful, low-cost computer such as a Raspberry Pi is used within the prediction system, the time required for prediction can become a significant factor.
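The trade-off between input sequence length and daily forecast coverage can be sketched with simple arithmetic. The example below assumes 1 min data, a 15 min forecast horizon, and a hypothetical 12 h production day; the function name `daily_coverage` is illustrative and not part of this work.

```python
def daily_coverage(day_minutes, n_input, horizon=15):
    """Return (warm_up, n_forecasts) for one day of 1 min data.

    warm_up:     minutes of photos that must be collected before the
                 first forecast of the day can be issued
    n_forecasts: number of forecasts whose target time still falls
                 inside the day
    """
    warm_up = n_input
    n_forecasts = max(0, day_minutes - n_input - horizon + 1)
    return warm_up, n_forecasts


for n_input in (15, 30, 45, 60, 75, 90, 105, 120):
    warm_up, n_forecasts = daily_coverage(720, n_input)
    print(f"{n_input:>3} min input: first forecast after {warm_up} min, "
          f"{n_forecasts} forecasts in a 12 h day")
```

With a 120 min input sequence the first two hours of the day have no forecast, in line with the limitation described above. Each forecast also processes eight times as many photos as with a 15 min sequence.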

Discussion
The paper confirms the effectiveness of the CNN-LSTM model in predicting the output of photovoltaic power plants. The model detects features in an image sequence and learns temporal patterns from the power output data. The integration of CNN and LSTM models enables the extraction of both spatial and temporal patterns from the input data, which underpins its performance.
The metrics of the proposed hybrid model, namely mean absolute error (MAE), root mean squared error (RMSE), and coefficient of determination (R²), show a strong correlation between actual and predicted generation, indicating high forecasting accuracy for photovoltaic power generation.
The fusion of CNN and LSTM in the photovoltaic power generation prediction model represents an advance in accuracy and practicality. The model uses low-resolution imagery and power plant data to produce accurate forecasts. Its application can be extended to different types of solar power plants, and it eliminates the need for costly pyranometers or complicated image acquisition systems, thereby reducing costs. However, this deep learning model requires an extensive training dataset, which could be a hurdle where data availability is limited. In addition, the model must be adjusted for each new site to ensure optimal performance, which means it cannot be applied to a new site without prior adjustments and optimizations.
The analysis of the effects of the temporal resolution of the input data on model accuracy illustrates the critical role that the choice of temporal resolution plays in the development of predictive models for photovoltaic power generation. This research shows that finer resolution can effectively capture the rapid changes in power output during primarily cloudy days, improving the accuracy of the model. However, the tradeoff between the improved accuracy provided by high-resolution data and the increased computational effort required to achieve it must be considered, especially in resource-limited scenarios. Reducing the temporal resolution is one possible strategy to reduce the computational load, but it may compromise the model's ability to accurately track rapid changes in power output. These findings help explain the relationship between data resolution and model performance and will help optimise future predictive models.
When evaluating the impact of different forecast horizons on forecast accuracy, it is found that extending the forecast horizon can increase forecast error. This could be due to the inherent unpredictability of atmospheric conditions, the sporadic nature of cloud formation and dissipation within short time intervals, and the uncertain velocity and direction of cloud motion at different altitudes.
Another finding of the analysis is that the length of the input photo sequence has a noticeable effect on the accuracy of the model. The more photos are added to the sequence, the more detailed the model's view of the preceding cloud conditions becomes, allowing it to better represent real-world conditions. However, a model that includes a larger number of photos in its input sequence also presents certain challenges. In addition to accuracy, the time required to train the various models was considered. As the number of input photos increases, the training time increases proportionally. Therefore, this tradeoff between accuracy, time, and computational effort should be considered during model construction.
Overcoming the challenges posed by the length of the input photo sequence is an important step towards further improving the performance of the model. Missing initial predictions at the beginning of the day due to a long input photo sequence can be filled by using data from the previous day, which ensures a seamless prediction from sunrise. As the computational cost increases with the size of the input data sets, implementing more efficient data processing strategies is critical. Reducing image resolution or prediction frequency can prove effective in reducing this computational load.

Conclusions
In this paper, a CNN-LSTM model has been developed and evaluated for predicting photovoltaic power plant output based on hemispherical sky photos and power output information. Despite the constraints of reduced image resolution and the use of a grayscale colour spectrum, the model has shown strong performance in forecasting solar power generation. The combination of CNN and LSTM has enabled effective learning of spatial and temporal patterns from the input data, resulting in improved performance compared to individual models.
As for possible improvements, the model could benefit from utilising high-resolution imagery as input data, which would allow the inclusion of additional CNN layers capable of processing input images with greater detail. In addition, the model could be trained using a more densely sampled data set, such as data collected at intervals shorter than one minute.
The model is widely applicable to real-world photovoltaic power plants and power systems. Accurate forecasts of photovoltaic power plant production can help system operators to plan and manage load and generation in advance, reduce costs, and improve overall system efficiency. In addition, the model can help better integrate photovoltaic power plants into the power grid, reducing the dependency on expensive and non-environmentally friendly backup power plants. It could also serve as a useful tool for maintenance planning and risk management in photovoltaic power plants.
Further research will focus on improving the performance of the model by integrating additional data, such as production data from other renewable energy sources or electricity consumption data. This research will include the development of new methods for assessing and minimising forecast uncertainty, which could improve applicability in the field.
In conclusion, this study is an example of how deep learning can contribute to advances in the renewable energy field. While Deep Learning models have been widely used in other sectors for some time, their application in the context of photovoltaic power plants and power systems is still relatively new. It is likely that this area will continue to expand as the technology evolves to provide even greater benefits in optimising renewable energy generation and energy system sustainability.

Figure 1. The power output of the photovoltaic power plant located at the university campus. The cloud edge effect is visible on days with cloud cover: the power output of the PV power plant reaches significantly higher values compared to the same time of day under cloudless conditions.

Figure 2. Cloud motion vector method for short-term forecasting.

Figure 3. The impact of time resolution in data retrieval on database accuracy [24]. A comparison of data retrieved with dt = 45 s and a 5-min interval. The display is for two characteristic days: (a) partially cloudy day; (b) overcast day.

Energies 2023, 16, 5428
... color spectrum appears to result in considerable information loss, it does not noticeably affect the prediction accuracy of the model for future photovoltaic power plant production. Crucial details such as the extent of cloud cover, the position of clouds, and the direction and speed of cloud movement are preserved even in these lower-resolution images. This lower resolution allows the model to efficiently predict energy production by retaining key image features. Moreover, the lower-resolution photos help to minimise noise and irrelevant cloud details that could potentially interfere with the model, allowing it to focus on the most relevant information. As a result, training efficiency improves by reducing the volume of data processed during training, resulting in faster training times and a more effective model.

Figure 4 .
Figure 4. Preprocessing of a randomly selected photograph: (a) The original photo; (b) reduction of resolution to 64 × 64 pixels; (c) conversion to grayscale format.

Figure 4 .
Figure 4. Preprocessing of a randomly selected photograph: (a) The original photo; (b) reduction of resolution to 64 × 64 pixels; (c) conversion to grayscale format.

Figure 5 .
Figure 5. Structure of a typical Convolutional Neural Network.

Figure 5 .
Figure 5. Structure of a typical Convolutional Neural Network.

Figure 7 .
Figure 7.The structure of the developed hybrid model.

Figure 7 .
Figure 7.The structure of the developed hybrid model.Figure 7. The structure of the developed hybrid model.

Figure 7 .
Figure 7.The structure of the developed hybrid model.Figure 7. The structure of the developed hybrid model.

Figure 8 .
Figure 8. Summary of our combined CNN-LSTM model architecture, showing the sequence of layers, their output shapes, and the number of trainable parameters.

Figure 8 .
Figure 8. Summary of our combined CNN-LSTM model architecture, showing the sequence of layers, their output shapes, and the number of trainable parameters.

Figure 10. Scatter plot depicting actual vs. predicted photovoltaic power output.

During the modification of the temporal resolution of the input data, all other parameters within the model were kept constant. This approach ensured that any variations in model accuracy could be directly attributed to changes in temporal resolution. Reducing the temporal resolution of the input data also reduces the amount of training data, so training takes substantially less time than with the original model. The training times for the network and other performance evaluation metrics are shown in Figure 11.
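As a rough illustration of why a coarser temporal resolution shrinks the training set, the sketch below thins a 5-minute series to a 15-minute one by keeping every third record. The text does not say how the resolution was reduced, so simple subsampling is an assumption, and the record layout (timestamp, image name, power) and the name `subsample` are hypothetical.

```python
from datetime import datetime, timedelta


def subsample(samples, step):
    """Keep every `step`-th record of a time-ordered list."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    return samples[::step]


# Hypothetical hour of 5-minute records: (timestamp, image name, power in kW).
start = datetime(2023, 1, 1, 8, 0)
five_min = [(start + timedelta(minutes=5 * i), f"img_{i:04d}", 0.0)
            for i in range(12)]

# 15-minute resolution: one record in three survives.
fifteen_min = subsample(five_min, 3)
```

With this scheme the number of training samples drops by the same factor as the resolution, which is consistent with the shorter training times reported for the coarser inputs.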

Figure 11. Model performance at different temporal resolutions of the input data.

Figure 12. Comparison of actual and predicted photovoltaic power on primarily cloudy days; (a) resolution of input data: 5 min; (b) resolution of input data: 15 min.

Figure 13. Performance of model across various forecast horizons.

4. Assessing Prediction Performance across Various Input Data Lengths

The LSTM model inherently captures production patterns that include annual, monthly, and daily cycles. The exact nature of these patterns within the model can be elusive as the model dynamically adapts, developing and associating patterns deemed most relevant to achieving highly accurate forecasts. It is important to emphasise that each new training iteration of the model can lead to different results. This variability is due to the stochastic adjustments made to the model during training, resulting in unique models after each training process.
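One common way to account for this run-to-run variability is to repeat training with several random seeds and report the mean and spread of the error. The sketch below does this for a toy one-parameter model trained by stochastic gradient descent; it is a stand-in for illustration only, not the CNN-LSTM model, and the data, learning rate, and function names are all assumptions.

```python
import random
import statistics


def train_once(seed, data, epochs=20, lr=0.05):
    """Fit y = w * x by SGD with a seed-dependent start and sample order."""
    rng = random.Random(seed)
    w = rng.uniform(-1.0, 1.0)          # random initialisation
    for _ in range(epochs):
        order = data[:]
        rng.shuffle(order)              # random sample order in each epoch
        for x, y in order:
            w -= lr * 2.0 * (w * x - y) * x
    return w


def rmse(w, data):
    return (sum((w * x - y) ** 2 for x, y in data) / len(data)) ** 0.5


# Fixed toy data set: y = 2x plus Gaussian noise.
data_rng = random.Random(0)
data = []
for _ in range(50):
    x = data_rng.uniform(0.0, 1.0)
    data.append((x, 2.0 * x + data_rng.gauss(0.0, 0.1)))

# Same data and hyperparameters; only the seed changes between runs.
weights = [train_once(seed, data) for seed in range(5)]
errors = [rmse(w, data) for w in weights]
print("RMSE mean:", statistics.mean(errors), "stdev:", statistics.stdev(errors))
```

Reporting a spread alongside each metric makes it easier to judge whether a difference between two configurations exceeds the variation between repeated trainings.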

Figure 14. Effect of input data length on forecast performance.

Figure 15. Impact of input data length on forecasting: The inability to predict the first two hours of production with a two-hour photo sequence requirement.
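The gap at the start of the day follows directly from how input sequences are assembled. A minimal sliding-window sketch, assuming 5-minute samples (so a two-hour input sequence spans 24 samples) and a hypothetical helper `make_windows`, shows that the first forecast can only be issued once a full window of photos exists:

```python
def make_windows(series, window, horizon=1):
    """Pair each full input window with the index of its forecast target.

    `window` and `horizon` are counted in samples; horizon=1 targets the
    sample immediately after the window.
    """
    pairs = []
    for end in range(window, len(series) - horizon + 1):
        target = end + horizon - 1
        pairs.append((series[end - window:end], target))
    return pairs


# Hypothetical 12-hour day at 5-minute resolution: 144 samples.
day = list(range(144))
pairs = make_windows(day, window=24)

first_target = pairs[0][1]                 # index of the first predictable sample
minutes_without_forecast = first_target * 5
```

Here `first_target` is 24, so the first 120 minutes of production have no forecast. A shorter input sequence shrinks this gap, at the cost of giving the model less temporal context.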