The Impact of Meteorological Data on the Accuracy of Solar Electricity Generation Forecasting Using Neural Networks

Sayenko, Yuriy; Pawelek, Ryszard; Baranenko, Tetiana; Liubartsev, Vadym

doi:10.3390/en18092309


¹ Institute of Electrical Power Engineering, Lodz University of Technology, 20 Stefanowskiego Street, 90-537 Lodz, Poland

² Department of Automation Electrical Systems and Electric Drive, Pryazovskyi State Technical University, 19 Dmytro Yavornytskyi Avenue, 49005 Dnipro, Ukraine

* Author to whom correspondence should be addressed.

Energies 2025, 18(9), 2309; https://doi.org/10.3390/en18092309

Submission received: 1 April 2025 / Revised: 22 April 2025 / Accepted: 29 April 2025 / Published: 30 April 2025

(This article belongs to the Section A2: Solar Energy and Photovoltaic Systems)


Abstract

The resolution of key tasks related to energy resource conservation and the enhancement of energy security is not possible without the widespread use of electricity generated by renewable energy sources (RES), including photovoltaic (solar) power plants. A drawback of connecting renewable energy sources to power grids is the difficulty of forecasting their generation, which in turn makes it difficult to plan the stable operation of power systems with distributed generation. To improve the accuracy of forecasting electricity generation by solar power plants, this article proposes using a new indicator, the unit active power generation (P*), in addition to meteorological parameters such as air temperature and wind speed. This indicator can be used in the development of mathematical models and methods to increase the energy efficiency of power systems utilizing renewable energy sources, particularly to enhance the accuracy of power generation forecasting using neural networks and machine learning. The article shows that using this indicator increases the accuracy of forecasting energy production by solar power plants.

Keywords: solar power plants; meteorology; solar radiation intensity; forecasting; correlation; modeling

1. Introduction

The development of renewable energy sources (RES) in the European Union (EU) plays a crucial role in the transition to a sustainable and low-emission economy. The EU is striving to significantly reduce greenhouse gas emissions and achieve carbon neutrality by 2050, which requires a substantial increase in the share of renewable energy in the energy mix. Key areas include the development of solar and wind energy, which have immense potential to meet Europe’s growing energy needs without harming the environment. Moreover, the introduction of RES helps reduce economic dependence on imported fossil fuels, which is strategically important for the region’s energy security.

In addition, investments in renewable energy sources create new jobs and help stimulate economic development, especially in non-urbanized (remote and rural) areas. Technological innovations in RES not only improve their efficiency and reliability but also make them more competitive with traditional energy sources. Support from the European Commission and national governments in creating favorable legislation plays a vital role in accelerating the integration of RES into the energy system. This allows for more effective use of the RES potential and minimizes financial losses associated with generation constraints and energy system imbalances. Thus, the development of RES in the EU is an important step towards a sustainable and environmentally friendly future.

One of the problems, in addition to the large financial losses of the state (using Ukraine as an example), is the responsibility of renewable energy sources for imbalances—deviations of actual electricity production from the forecast. Unlike traditional energy sources (e.g., coal or nuclear power plants), it is quite difficult to forecast electricity production from RES due to the variability of weather conditions. Modern forecasting systems predict hourly production from RES with an average error of around 10–20%. Adopting a more decentralized and transparent approach to organized trade and regional balancing could eliminate inefficiencies in the centralized energy market model and improve its functionality [1].

An analysis of the Electricity Market Act [2] highlights the importance of accurate forecasting of RES generation to reduce losses for power plant owners and ensure the reliability of the grid. Therefore, the issue of more accurate forecasting of electricity production from RES, both large power plants (solar and wind farms) and small, private generation units (prosumers), becomes pressing. More accurate forecasting will allow grid operators to efficiently utilize the potential of RES without resorting to generation limitations. It will also help reduce the need to maintain outdated coal-fired power plants in “hot” reserve and increase energy efficiency in networks through more precise planning of power flows from various RES.

The development of RES and the introduction of cheaper, sustainable and environmentally friendly alternatives to traditional energy sources provide the countries affected by fossil fuel shortages with an opportunity to reduce energy dependency [3].

Current research related to photovoltaic (PV) energy generation forecasting emphasizes the use of meteorological data to improve the forecast accuracy [4,5]. The forecasting methodology for PV sources is based on the use of the design of experiments (DOE) method [6], which can significantly reduce the number of experimental runs while ensuring high decision-making accuracy and reliability. A review of the application of machine learning (ML) and deep learning (DL) methods for PV power forecasting shows that while ML methods have been widely studied, the use of DL methods for this purpose is limited [7].

The systematic review of current trends in photovoltaic power forecasting technologies highlights the need to develop highly accurate forecasting models with minimal dependence on weather conditions [8]. The application of ML and artificial intelligence (AI) methods in this area can improve the forecast accuracy. The work [9] presents a method for short-term PV power forecasting based on recurrent neural networks (RNN). The model shows high forecast accuracy in 15 min and 30 min horizons, indicating the high stability and reliability of the proposed method for real PV power plants.

The work [10] analyzes various methods for forecasting PV power, comparing them in terms of the prediction method, time horizon, measurement error and computational cost. Artificial neural network (ANN) and support vector machine (SVM) methods are reported to be the most effective for complex nonlinear predictive models. In addition to using ANN alone for forecasting electricity generation by PV plants, the use of hybrid models combining ANN and statistical methods to forecast solar radiation intensity has been suggested [11]. These methods have shown the best performance compared to traditional methods, especially for sunny days.

The use of ANN for forecasting electricity imbalances, presented in [12], showed that the long short-term memory (LSTM) model has the lowest error values compared to other methods, indicating the high effectiveness of the proposed approach. Unlike traditional RNN, LSTM networks can store information over long intervals, which is achieved through the use of special mechanisms such as memory cells and control gates.

A deep learning artificial neural network architecture was proposed for predicting the amount of electricity supplied by renewable energy producers, as well as the upper and lower limits of the forecasting interval; its characteristic feature is the use of auto-coding blocks with short connections [13]. This reduced the average day-ahead forecast error to 4.46% and the maximum error to 12.81%.

At the same time, a significant part of the research is devoted to the linking of meteorological parameters with the production of electricity by solar power plants. For example, a study on the relationship between meteorological variables and the output power of a PV plant presented in the paper [14] found a strong correlation between temperature, solar radiation intensity and installation efficiency. The use of dimensionality reduction techniques, such as feature selection and principal component analysis, can significantly reduce computation time while maintaining high model accuracy. Additionally, an innovative approach to PV power forecasting based on satellite images described in the paper [15], utilizing a model that accounts for the nonlinear movement of clouds, allows for more accurate predictions of their trajectories. The work [16] uses an artificial neural network to predict generation based on various weather parameters. However, in the case of forecasting electricity generation using meteorological data, it is crucial to consider the potential for missing data. For example, studies have demonstrated that hybrid learning methods robust to data absence can maintain high forecasting accuracy even when faced with substantial amounts of incomplete data [17].

In addition to meteorological parameters, it is also proposed to use the characteristics of ground station locations for forecasting. For instance, based on a broad learning system and copula theory, it is proposed to improve the accuracy of distributed generation forecasting [18]. This approach enables a more precise consideration of time–space dependencies between installations, allowing for more effective accounting of interdependencies between the outputs of individual photovoltaic plants. In the paper [19], the importance of considering the geographical distribution of installations and the combination of different types of installations is emphasized. The work [20] proposes using a natural simulation model to improve the automatic PV generation forecasting system. This helps increase PV generation volumes in power systems and improves the management of distributed generation.

In addition to the technical aspects of PV plant operation forecasting, research is also being conducted on the integration of PV plants into the power grids of various countries. The results presented in the paper [21] showed that Germany and Spain have the highest forecast accuracy. Power forecasts enable grid operators and system designers to optimally plan plant operation and manage energy supply and demand. A study of the forecasting system in the German energy market indicates that improved forecasts for fundamental variables, such as electricity demand and solar and wind energy production, contributed to a 13.5% increase in revenue from market selection for energy sales [22]. Study [23] discussed the importance of PV energy generation forecasting for successful integration into the power grid and participation in energy markets.

Many of the analyzed studies present comparisons of different models for forecasting the operation of photovoltaic (PV) power plants. For example, an ensemble forecasting method based on energy consumption data was used for planning operations in the electricity market [24]. Future steps may involve integrating additional data, such as weather conditions, to improve the models. The comparison of LSTM and gated recurrent unit (GRU) neural network models for predicting PV power generation showed that GRU outperforms LSTM when using long-term training data [25]. The study presented in the paper [26] proposed a method for forecasting PV power in distribution networks with high PV saturation, based on the use of a small set of representative monitoring locations.

The analysis of the publications shows that PV generation forecasting depends on the accuracy of meteorological data and the use of advanced machine learning (ML) and deep learning (DL) methods. Future research should focus on developing highly accurate predictive models by considering and integrating additional data, such as weather conditions, to improve the forecast accuracy. It can be stated that hourly power generation forecasting using neural networks is characterized by high accuracy, and further studies may focus on reducing the computational resource requirements and training time for neural networks.

For example, the average error for the actual measured solar radiation intensity ranges from −0.97% to 4.91%, and for calculated values and weather forecasts, from −3.86% to 5.12% [27]. This results from the inaccuracies in weather forecasts and calculations of solar radiation intensity. ANN requires only 1000–2000 data samples, which reduces computational requirements and accelerates training. To improve forecasting accuracy, it is necessary to determine the optimal set of input data for ANN (primarily meteorological data) needed for high-quality forecasting of solar power plant operation, without overloading the models with unnecessary data that complicate the ANN training process.

2. Research Objectives and Analysis Method

The subject of the research presented in the article is the identification of significant meteorological parameters that influence the accuracy and reliability of forecasting electricity generated by photovoltaic power plants. It is obvious that not all meteorological parameters have the same impact on the production of electricity by solar power plants. Considering additional meteorological parameters when building a model forecasting electricity production not only complicates the operation of the neural network but also involves the need to obtain preliminary information about these parameters, install appropriate sensors and perform additional measurements. At the same time, it should also be taken into account that some meteorological parameters do not lead to a significant improvement in the quality of the forecast. An important factor influencing the error in forecasting electricity production is the correlation between production and various meteorological parameters.

To this end, this study employed the autocorrelation function and cross-correlation function between meteorological data and the performance of solar panels. The outcome of the research is the determination of an optimal set of input data (meteorological parameters) for constructing simple models dedicated to forecasting electricity production by solar power plants, without considering all meteorological parameters. The study also accounted for the impact of time delays in data series on the forecasting process, which will enable the development of more accurate and reliable predictive models. This approach allows for the assessment of the reliability of using solar energy as an alternative energy source and offers practical recommendations for improving the forecasting process and managing electricity production.

2.1. Method Description

The autocorrelation function assesses the extent to which the values of a time series depend on its own past values. It illustrates how the values of the series correlate with each other at different time lags. The autocorrelation function allows for the determination of the degree and direction of the linear relationship between two values of a random process at different times: t₁ and t₂.

For stationary, ergodic random processes, the autocorrelation function does not depend on the specific times t₁ and t₂ but only on the time lag τ = t₂ − t₁ between the ordinates of the active power P(t), and it can be determined based on a single measurement process over a sufficiently long period T [28]:

R (τ) = E ((P (t) - E_{P}) (P (t + τ) - E_{P}))

(1)

or

R (τ) = \lim_{T \to \infty} \frac{1}{T - τ} \int_{0}^{T - τ} (P (t) - E_{P}) (P (t + τ) - E_{P}) \, dt

(2)

where E_P is the expected value of the process P(t).

For a discrete random process P(t), Equation (2) takes the form of Equation (3), in which the value of the autocorrelation sequence R_P(m) at time shift m is given by [28]:

R_{P} (m) = \frac{1}{N - m} \sum_{n = 1}^{N - m} (P_{n} - P_{av}) (P_{n + m} - P_{av})

(3)

where P_n is the value of active power at time n and P_av is the average value of the generated active power P(n) over the analyzed time period.
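Equation (3) can be evaluated directly from a list of hourly power values. The following is a minimal sketch in plain Python, not the authors' implementation; the function name `autocorrelation` and the sample profile are assumptions made for illustration only.

```python
def autocorrelation(p, m):
    """Sample autocorrelation R_P(m) of the sequence p at lag m, following Eq. (3)."""
    n_total = len(p)
    if not 0 <= m < n_total:
        raise ValueError("lag m must satisfy 0 <= m < len(p)")
    p_av = sum(p) / n_total  # average power over the analyzed period
    # Sum of products of deviations that are m samples apart
    acc = sum((p[n] - p_av) * (p[n + m] - p_av) for n in range(n_total - m))
    return acc / (n_total - m)


# Hypothetical, strongly simplified generation profile (MW); not measured data
power = [0.0, 2.0, 4.0, 2.0, 0.0, 2.0, 4.0, 2.0]

r0 = autocorrelation(power, 0)  # lag 0: variance of the series
r2 = autocorrelation(power, 2)  # half a period of this toy signal: negative value
r4 = autocorrelation(power, 4)  # full period of this toy signal: back to the lag-0 value
```

For a real generation series the lag of interest would typically be 24 samples (one day at hourly resolution), where a high value of R_P(m) reflects the daily periodicity of the process.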

Figure 1 shows an example of the realization of the random power generation process P(t), while an example of the autocorrelation function for this process is illustrated in Figure 2. The cross-correlation function determines the linear relationship between two random processes at different intervals. It shows how the values of one series correlate with the values of another series. The cross-correlation function extends the capabilities of correlation analysis by allowing consideration of time lags between changes in meteorological conditions and the response of solar panel power generation to them.

In the context of solar panel power generation forecasting, the cross-correlation function is used to analyze the effect of various meteorological factors, such as solar radiation, temperature, humidity and wind speed on the performance of solar panels. The cross-correlation function of the random processes of solar power plant power generation P(t) and a selected meteorological parameter X(t) is defined as [28]

R_{P X} (m) = \frac{1}{N - m} \sum_{n = 1}^{N - m} (P_{n} - E_{P}) (X_{n + m} - E_{X})

(4)

Figure 3 shows an example realization of the random generation process P(t) and of the random process of changes in a selected meteorological parameter X(t). In the analysis of random processes, a normalized cross-correlation function of the following form is often used [28]:

r_{PX} (m) = \frac{R_{PX} (m)}{σ_{P} σ_{X}} = \frac{R_{PX} (m)}{\sqrt{D_{P}} \sqrt{D_{X}}}

(5)

where σ_P and σ_X are the standard deviations of the generated power P and of the meteorological parameter X, respectively, and D_P and D_X are the corresponding variances.
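As an illustration of Equations (4) and (5), the sketch below computes the cross-correlation of two equally long series at a lag m and normalizes it by the standard deviations. It is a hedged example only: the function names, the use of the population variance (division by N) and the sample series are assumptions, not details taken from the study.

```python
import math


def cross_correlation(p, x, m):
    """R_PX(m) per Eq. (4): deviations of p paired with deviations of x taken m samples later."""
    n_total = len(p)
    if len(x) != n_total:
        raise ValueError("series must have equal length")
    if not 0 <= m < n_total:
        raise ValueError("lag m must satisfy 0 <= m < len(p)")
    e_p = sum(p) / n_total
    e_x = sum(x) / n_total
    acc = sum((p[n] - e_p) * (x[n + m] - e_x) for n in range(n_total - m))
    return acc / (n_total - m)


def normalized_cross_correlation(p, x, m):
    """r_PX(m) per Eq. (5), using population standard deviations (an assumption)."""
    sigma_p = math.sqrt(cross_correlation(p, p, 0))
    sigma_x = math.sqrt(cross_correlation(x, x, 0))
    if sigma_p == 0 or sigma_x == 0:
        raise ValueError("a constant series has no defined correlation")
    return cross_correlation(p, x, m) / (sigma_p * sigma_x)


# Hypothetical series: 'lagged' repeats the shape of 'power' one sample later
power = [0.0, 2.0, 4.0, 2.0, 0.0, 2.0, 4.0, 2.0]
lagged = [2.0, 0.0, 2.0, 4.0, 2.0, 0.0, 2.0, 4.0]

r_lag0 = normalized_cross_correlation(power, lagged, 0)
r_lag1 = normalized_cross_correlation(power, lagged, 1)  # larger than r_lag0
```

Scanning m over a range of lags and taking the lag with the largest value is one way to estimate the delay between a meteorological parameter and the plant response. Note that with the 1/(N − m) normalization the estimate at non-zero lags is not guaranteed to stay within [−1, 1] for short series.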

For stationary random processes, the cross-correlation function at τ = 0 does not depend on the moment of time t and is the cross-correlation coefficient of these random processes [29]. The cross-correlation coefficient of the random generation process P(t) and the random process of changing the meteorological parameter X(t) has the form

r_{PX} = \frac{N \sum_{n = 1}^{N} P_{n} X_{n} - \sum_{n = 1}^{N} P_{n} \sum_{n = 1}^{N} X_{n}}{\sqrt{\left(N \sum_{n = 1}^{N} P_{n}^{2} - \left(\sum_{n = 1}^{N} P_{n}\right)^{2}\right) \left(N \sum_{n = 1}^{N} X_{n}^{2} - \left(\sum_{n = 1}^{N} X_{n}\right)^{2}\right)}}

(6)

where P_n is the value of the power generated by the solar power plant, X_n is the value of the corresponding meteorological parameter, and N determines the number of pairs of these values.
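The sum-based form of Equation (6) translates almost line by line into code. The sketch below is illustrative only; the function name `pearson_coefficient` and the sample values are hypothetical.

```python
import math


def pearson_coefficient(p, x):
    """Correlation coefficient of paired samples, written term by term as in Eq. (6)."""
    n = len(p)
    if n != len(x) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    sum_p = sum(p)
    sum_x = sum(x)
    sum_px = sum(a * b for a, b in zip(p, x))
    sum_p2 = sum(a * a for a in p)
    sum_x2 = sum(b * b for b in x)
    numerator = n * sum_px - sum_p * sum_x
    denominator = math.sqrt((n * sum_p2 - sum_p ** 2) * (n * sum_x2 - sum_x ** 2))
    if denominator == 0:
        raise ValueError("a constant series has no defined correlation")
    return numerator / denominator


# Hypothetical values: power rising with irradiance, humidity falling
power = [1, 2, 3, 4]
irradiance = [2, 4, 6, 8]
humidity = [8, 6, 4, 2]

r_irradiance = pearson_coefficient(power, irradiance)  # exactly linear, increasing
r_humidity = pearson_coefficient(power, humidity)      # exactly linear, decreasing
```

In the study this calculation is applied to daily vectors of 24 hourly values and then averaged over the year; the four-point series above only demonstrates the arithmetic.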

Using Pearson’s cross-correlation function to assess the relationship between generated active power and various meteorological parameters provides only an initial insight into the degree of their interdependence. This method has limitations, as it only evaluates linear dependencies. In solar generation, the relationship between output power and climate variables, such as solar radiation, temperature and humidity, is often nonlinear.

To refine the assessment of how these nonlinear dependencies affect forecasting accuracy, Spearman’s rank correlation method can be applied. This method evaluates monotonic dependencies between variables and is less sensitive to outliers. To assess the monotonic dependence between variables X (meteorological data) and P (generation), the following formula is used [30]:

ρ = 1 - \frac{6 \sum_{i = 1}^{n} d_{i}^{2}}{n (n^{2} - 1)},

(7)

where d_i is the difference between the ranks of X and P for the i-th observation and n is the number of observations.
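A short sketch of Spearman's rank correlation is given below. It uses the standard rank-difference form ρ = 1 − 6Σd_i²/(n(n² − 1)), which is valid only when there are no tied values; the helper names and data are assumptions for illustration.

```python
def ranks(values):
    """Ranks starting at 1 for the smallest value; assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman_rho(x, p):
    """Spearman's rho from rank differences d_i (no ties allowed in this sketch)."""
    n = len(x)
    if n != len(p) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    if len(set(x)) != n or len(set(p)) != n:
        raise ValueError("tied values need the general rank formula, not shown here")
    d_squared = sum((rx - rp) ** 2 for rx, rp in zip(ranks(x), ranks(p)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical monotonic but nonlinear dependence: p = x cubed
x = [1, 2, 3, 4, 5]
p = [1, 8, 27, 64, 125]

rho_up = spearman_rho(x, p)          # perfectly monotonic increasing
rho_down = spearman_rho(x, p[::-1])  # perfectly monotonic decreasing
```

Because only ranks enter the calculation, a monotonic nonlinear dependence such as the cubic example still yields the extreme value of the coefficient, which is why the method complements the linear measure of Equation (6).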

2.2. Measurement Data

To determine the relationship between electricity production and meteorological data, the article uses a set of measurement data for January–December 2021 obtained from one of Ukraine’s photovoltaic farms, with an installed capacity of 320 MW. The basic data on electricity production, expressed in MWh, are presented in 1 h increments, so each value corresponds to the average power over that hour. Meteorological data obtained from a weather station located at the solar power plant site are averaged over 5 min intervals. To match them with the power generation data, the meteorological data were averaged to a resolution of 1 h. The parameters for which correlations with electricity production data were evaluated are listed in Table 1.
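The alignment of 5 min meteorological samples with hourly generation data can be sketched as block averaging over twelve consecutive samples. The function name and the sample values below are hypothetical, and the sketch assumes a gap-free series that starts on the hour.

```python
SAMPLES_PER_HOUR = 12  # twelve 5-minute samples in one hour


def hourly_means(samples_5min):
    """Average consecutive blocks of twelve 5-minute samples into hourly values."""
    if len(samples_5min) % SAMPLES_PER_HOUR != 0:
        raise ValueError("expected a whole number of hours of 5-minute samples")
    return [
        sum(samples_5min[i:i + SAMPLES_PER_HOUR]) / SAMPLES_PER_HOUR
        for i in range(0, len(samples_5min), SAMPLES_PER_HOUR)
    ]


# Hypothetical two hours of readings: a constant hour followed by a ramp 0, 1, ..., 11
readings = [10.0] * 12 + [float(v) for v in range(12)]
hourly = hourly_means(readings)
```

Real station data would additionally need timestamps to be checked for gaps before averaging.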

3. Research Results

In the first stage of the analysis, correlation coefficients were determined between the daily vectors of hourly values of individual meteorological parameters, e.g., temperature [T1, T2, …, T24] and the generated active power [P1, P2, …, P24]. The results of the calculations, averaged for the period of the year, are shown in Table 1. Spearman’s rank correlation coefficients were calculated for all meteorological parameters, similar to the Pearson correlation coefficient calculations (Table 1). The results showed that Spearman’s correlation between active power and meteorological parameters was higher than Pearson’s coefficient, confirming the presence of a nonlinear relationship.

As can be seen from this table, the highest correlation with active power generation is shown by the parameters related to the intensity of solar radiation (about 0.92), the time of day and panel temperature (0.82 each), as well as air temperature (0.54) and wind speed (0.56). Example measurement results illustrating the daily relationship between changes in various meteorological parameters and the active power generated by a solar power plant are shown in Figures 4–7.

Since solar power generation depends on the time of day, a modified time scale that more accurately reflects the physical nature of the process was introduced for assessing the correlation of generation with the time of day. The time scale was transformed so that a count increasing from 0 to 12 corresponds to the increase in solar intensity before noon, and a count decreasing from 12 to 0 reflects the decrease in intensity in the afternoon. This approach gives an approximately linear representation of the relationship between the time of day and the level of power generation, removing the nonlinearity associated with conventional clock time, which in turn can improve the accuracy of forecasting and modeling in photovoltaic power systems.
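One simple way to realize such a folded time scale is to measure the distance of the clock hour from noon. The mapping below is an assumption consistent with the description (0 at midnight, 12 at noon, back to 0 at the next midnight); the exact transformation used in the study is not given, and the name `folded_hour` is hypothetical.

```python
def folded_hour(hour):
    """Map a clock hour in [0, 24] to a scale that rises 0 -> 12 until noon and falls back to 0."""
    if not 0 <= hour <= 24:
        raise ValueError("hour must lie between 0 and 24")
    return 12 - abs(hour - 12)


# Morning and afternoon hours equally far from noon receive the same value
scale = [folded_hour(h) for h in (0, 6, 12, 18, 24)]
```

With this mapping, hours symmetric about noon (for example 09:00 and 15:00) are treated as equivalent inputs, which matches the roughly symmetric daily generation curve.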

The main reasons for the high level of the correlation coefficient of the time of day with the indicator under consideration (0.82) can be associated with the cyclicality and predictability of daily changes in meteorological parameters. The time of day has a significant impact on many factors, such as the intensity of solar radiation and temperature, which in turn can significantly affect the level of power generation by the solar power plant.

An important role is also played by the temperature of the panels, for which the correlation coefficient is r = 0.82. The temperature of the panels affects their performance. For example, in the case of solar panels, their temperature directly affects the efficiency and effectiveness of converting solar energy into electricity. However, it must be taken into account that a higher panel temperature can also lead to a reduction in the efficiency of the solar power plant’s power generation (Figure 4).

At the same time, air temperature, which has a correlation coefficient of 0.54, has a moderate effect on the analyzed parameters. Air temperature alone does not affect the generation of power by the solar power plant; it is only an intermediate parameter that depends on the intensity of solar radiation. Figure 5 shows the variation of air temperature and generated active power: the maximum generation of active power occurs at 11:00–12:00, while the maximum air temperature occurs at 14:00–15:00. This is because air has thermal inertia, so it takes time to warm up after the solar radiation intensity reaches its maximum.

Wind speed also has a significant influence, with a correlation coefficient of 0.56. Wind can cause a variety of effects, such as cooling the surface of the panels or changing the heat exchange with the environment, which affects the parameters studied. However, the mechanism of the effect of this parameter on active power generation requires further research in this direction. A graph illustrating the changes in active power generation and wind speed is shown in Figure 6.

In addition, the reason for the negative value of the correlation coefficient for some meteorological parameters (e.g., r = −0.48 for air humidity) is clear. Figure 7 shows the daily variation of active power generation and air humidity: humidity decreases as generation increases. Of course, this parameter is not directly related to generation, but it indirectly indicates the level of solar radiation and the associated air temperature, which reduce the humidity level.

The values of the correlation coefficients given in Table 1 do not properly reflect the real impact of meteorological parameters on electricity production, as it is clear that the correlation of active power generation with the intensity of solar radiation dominates. It should be noted that meteorological parameters (air temperature, panel temperature, air humidity and wind speed) depend directly on the intensity of solar radiation. So, for example, the correlation of panel temperature or air temperature is part of the correlation between active power generation and solar radiation.

In order to eliminate the influence of solar radiation intensity on the values of other meteorological parameters, it was decided to correlate these parameters with the unit active power generation (P*), i.e., the generated power per unit of current solar radiation intensity (kW/(W/m²)), as a more objective assessment of the correlation relationship. The unit active power generation is defined by the formula:

P^{*} = \frac{P}{G}

(8)

where P is the generated active power (kW), and G is the solar radiation (W/m²).
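Equation (8) is a single division, but at night G is zero and the ratio is undefined. The sketch below shows one possible way to handle this; the threshold `g_min` and the choice to return `None` are assumptions for illustration, not part of the published method, and the numerical values are hypothetical.

```python
def unit_active_power(p_kw, g_w_m2, g_min=1.0):
    """Return P* = P / G per Eq. (8) in kW/(W/m^2), or None when irradiance is below g_min.

    g_min is a hypothetical cut-off that avoids division by zero at night.
    """
    if g_w_m2 < g_min:
        return None
    return p_kw / g_w_m2


# Hypothetical daytime sample: 160 MW generated at 800 W/m^2
p_star_day = unit_active_power(160000.0, 800.0)
# Hypothetical night-time sample: no irradiance, no generation
p_star_night = unit_active_power(0.0, 0.0)
```

How low-irradiance hours are treated (dropped, clipped or filled) affects the resulting correlations, so this choice should be stated explicitly in any reproduction.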

Figure 8 shows a combined graph of solar power plant power generation, solar radiation and unit active power generation. This graph confirms the close dependence of generation on solar radiation, resulting in a high coefficient of their correlation (Table 1). At the same time, it can be noted that the level of unit active power generation (P*) practically does not change during the period of operation of the power plant (from sunrise to sunset), since the generation and intensity of solar radiation change similarly during the day.

The use of this indicator (P*) allows for the determination of more realistic correlation coefficients between meteorological parameters and electricity production. Table 2 shows the results of calculating the correlation coefficients of meteorological parameters with unit active power generation (P*).

The correlation coefficients shown in Table 2 have lower values than those in Table 1 and better reflect the relationship between selected meteorological parameters and unit active power generation. The highest correlation coefficients for the studied period are those for the time of day (0.71), panel temperature (0.57), wind speed (0.42) and air temperature (0.35). In addition to the meteorological parameters presented, there are also parameters that require more detailed study, such as the intensity and amount of precipitation (r = 0), as well as the degree of pollution of photovoltaic panels (r = −0.14). Their influence on the generation of energy by the solar power plant is practically absent in this study.

The unit active power generation P*(t) does not remain constant during the day, although it changes to a much lesser extent than the generation P(t) (Figure 8). This is because the changes in P*(t) are influenced by virtually all meteorological parameters, with the exception of solar radiation. The stronger the correlation between meteorological parameters and power generation, the greater the changes in the unit active power generation.

To evaluate the influence of correlation coefficients on the quality of forecasting of solar power plant (SPP) generation, an artificial neural network was created, the structure of which is shown in Figure 9. The type, structure and parameters of the ANN were selected based on the experience of previous studies [27].

Neural Network Parameters:

– Type: feedforward network;
– Number of neurons in the hidden layer: 30;
– Hidden layer activation function: hyperbolic tangent sigmoid;
– Output layer activation function: linear;
– Learning algorithm: Levenberg–Marquardt;
– Performance evaluation function: mean squared error;
– Maximum number of learning epochs: 1000;
– Minimum gradient to stop learning: 1 × 10⁻⁵.
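The structure described by these parameters (one hidden layer of 30 neurons with a hyperbolic tangent activation, followed by a linear output) can be sketched as a forward pass in plain Python. This is an illustration only: the weights are random placeholders, the helper names are hypothetical, and the Levenberg–Marquardt training used in the study is not implemented here.

```python
import math
import random

HIDDEN_NEURONS = 30  # hidden layer size listed in the parameters above


def init_network(n_inputs, seed=0):
    """Create untrained placeholder weights for an n_inputs -> 30 -> 1 network."""
    rng = random.Random(seed)
    return {
        "w_hidden": [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                     for _ in range(HIDDEN_NEURONS)],
        "b_hidden": [0.0] * HIDDEN_NEURONS,
        "w_out": [rng.uniform(-0.5, 0.5) for _ in range(HIDDEN_NEURONS)],
        "b_out": 0.0,
    }


def forward(net, inputs):
    """Forward pass: tanh hidden layer, then a linear output neuron."""
    hidden = [
        math.tanh(sum(w * v for w, v in zip(row, inputs)) + b)
        for row, b in zip(net["w_hidden"], net["b_hidden"])
    ]
    return sum(w * h for w, h in zip(net["w_out"], hidden)) + net["b_out"]


# Hypothetical scaled inputs, e.g. irradiance, panel temperature and wind speed
net = init_network(n_inputs=3)
prediction = forward(net, [0.5, 0.2, 0.9])
```

The output of an untrained network is meaningless; the sketch only shows how the listed layer sizes and activation functions fit together.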

The work takes into account that in real conditions, meteorological monitoring data may be incomplete or contain gaps. To ensure the correct operation of the neural network in the event of missing data, the interquartile range (IQR) method is used, the essence of which is to divide the data sample into quartiles:

– Q1 (first quartile): the value below which 25% of the sample lies;
– Q3 (third quartile): the value below which 75% of the sample lies;
– IQR = Q3 − Q1: the interquartile range.

A value is considered abnormal if it is less than Q1 − 1.5 IQR or greater than Q3 + 1.5 IQR.

In the next step, the “outlier” is replaced by the average value of neighbouring points (rolling average). In addition, the architecture of the neural network itself has the ability to generalize, which allows the reduction of the sensitivity of the model to noise and random fluctuations in the data.
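The IQR rule and the replacement step can be sketched as follows. The quartile interpolation method and the use of the two immediate neighbours as the "rolling average" window are assumptions, since the study does not specify them; the function names and readings are hypothetical.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Lower and upper limits Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def replace_outliers(values, k=1.5):
    """Replace each outlier with the mean of its immediate non-outlier neighbours."""
    low, high = iqr_bounds(values, k)
    cleaned = list(values)
    for i, value in enumerate(values):
        if value < low or value > high:
            neighbours = [
                values[j] for j in (i - 1, i + 1)
                if 0 <= j < len(values) and low <= values[j] <= high
            ]
            if neighbours:  # leave the point unchanged if no valid neighbour exists
                cleaned[i] = sum(neighbours) / len(neighbours)
    return cleaned


# Hypothetical hourly temperature readings with one spurious spike
readings = [10.0, 11.0, 12.0, 100.0, 12.0, 11.0, 10.0, 11.0]
limits = iqr_bounds(readings)
cleaned = replace_outliers(readings)
```

For strongly periodic quantities such as irradiance, applying the rule per hour of day rather than over the whole series may be more appropriate, because the daily peak is not an anomaly.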

Next, min–max normalization of the intensity indicators of the input parameters and the active power generated by the solar power plant is performed. This allows the elimination of discrepancies in the data scales and reduction of the impact of outliers (anomalous values), which can be caused by measurement errors or instability of weather conditions:

X_{scaled} = \frac{X - X_{\min}}{X_{\max} - X_{\min}} .

(9)
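Equation (9) can be applied to each input series separately. The sketch below is a minimal illustration with a hypothetical function name and sample values.

```python
def min_max_scale(values):
    """Scale values to the range [0, 1] following Eq. (9)."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        raise ValueError("cannot scale a constant series")
    return [(v - x_min) / (x_max - x_min) for v in values]


# Hypothetical air temperature readings in degrees Celsius
temperatures = [5.0, 10.0, 15.0, 25.0]
scaled = min_max_scale(temperatures)
```

In a forecasting setting, X_min and X_max are usually taken from the training data and reused for new inputs so that both are scaled consistently; this is common practice rather than a detail stated in the study.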

Today, as a rule, the Mean Absolute Percentage Error (MAPE) prevails in assessing the accuracy of forecasting electricity production and consumption. The formula for MAPE includes the calculation of the relative error for each value, expressed as the ratio of the absolute error to the actual value [31]:

\mathrm{MAPE} = \frac{1}{n} \sum_{i = 1}^{n} \left| \frac{P'_{i} - P_{i}}{P_{i}} \right| \cdot 100\%,

(10)

where n is the size of the data sample (24 h), P_i is the actual value and P′_i is the predicted value.
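A direct implementation of Equation (10) is sketched below; the function name and the sample values are hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percentage error per Eq. (10), in percent.

    Raises ZeroDivisionError if any actual value is exactly zero.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty series of equal length")
    total = sum(abs((f - a) / a) for a, f in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical hourly generation (MW) and forecast
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]
error_percent = mape(actual, forecast)  # relative errors of 10%, 10% and 0%
```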

The use of the MAPE metric to assess forecasting accuracy has a significant drawback in cases where actual values are close to zero. When P_i approaches zero, the denominator becomes very small, which leads to a sharp increase in the relative error. This creates a distortion in the overall estimate, as even small absolute deviations can result in very high relative error values. As a result, the MAPE metric becomes unsuitable for analyzing data that contain low or zero values. To address this issue, we propose an approach that uses error normalization by dividing MAE (Mean Absolute Error) by the average power consumption to avoid this drawback [31]:

\mathrm{ERROR} = \frac{\mathrm{MAE}}{\frac{1}{n} \sum_{i = 1}^{n} P_{i}} \cdot 100\%,

(11)

MAE = \frac{1}{n} \sum_{i = 1}^{n} \left| P'_{i} - P_{i} \right| .

(12)

This approach ensures the stability of the estimate, since the normalization is performed relative to the average level of energy consumption and not to each individual value. Thus, the large error near zero values of the actual consumption does not dominate the overall estimate, making the metric more stable and adequate for use in electricity consumption forecasting.
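The difference between Formulas (10) and (11) can be seen in a small numerical sketch. The function names and the four-point sample below are hypothetical and chosen only to show the effect of one near-zero actual value; they are not data from the study.

```python
def mape(actual, predicted):
    """Formula (10): mean absolute percentage error, in percent.

    Raises ZeroDivisionError if any actual value is exactly zero.
    """
    n = len(actual)
    return 100.0 * sum(abs((p - a) / a) for a, p in zip(actual, predicted)) / n


def mae(actual, predicted):
    """Formula (12): mean absolute error, in the units of the data."""
    n = len(actual)
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / n


def normalized_error(actual, predicted):
    """Formula (11): MAE divided by the mean actual value, in percent."""
    mean_actual = sum(actual) / len(actual)
    return 100.0 * mae(actual, predicted) / mean_actual


# Hypothetical sample: every forecast is off by exactly 1 unit,
# but the first actual value is close to zero.
actual = [0.5, 50.0, 100.0, 49.5]
predicted = [1.5, 51.0, 101.0, 50.5]

mape_value = mape(actual, predicted)               # dominated by the first point
error_value = normalized_error(actual, predicted)  # 2.0
```

With identical absolute errors at every point, MAPE comes out above 50% because of the single near-zero value, while the normalized error is 2%.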

Table 3 shows the forecast errors (%), calculated by Formula (10), for different sets of input data. The last two columns present the 14-day average forecast error and the forecast improvement obtained by using the unit active power generation.
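As an illustration of how the last two columns of Table 3 relate to the daily values, the sketch below recomputes them for item 6. The variable names are hypothetical. The improvement is assumed to be the relative reduction of the rounded 14-day averages, which appears consistent with the tabulated value.

```python
# Daily forecast errors (%) for item 6 of Table 3.
without_p_star = [0.80, 4.20, 2.08, 0.52, 1.03, 4.52, 1.66,
                  0.57, 1.47, 0.62, 1.00, 1.32, 1.03, 0.98]
with_p_star = [1.00, 2.32, 2.16, 0.68, 2.05, 5.44, 1.50,
               0.53, 1.22, 0.29, 0.15, 0.34, 0.39, 0.33]

# 14-day averages, rounded to two decimals as in the table.
avg_without = round(sum(without_p_star) / len(without_p_star), 2)
avg_with = round(sum(with_p_star) / len(with_p_star), 2)

# Assumption: improvement = relative reduction of the rounded averages, in percent.
improvement = round(100.0 * (avg_without - avg_with) / avg_without, 2)

print(avg_without, avg_with, improvement)
```

Under these assumptions the script reproduces the tabulated averages (1.56 and 1.31) and the 16.03% improvement for item 6.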

The results show that taking into account meteorological parameters with a high correlation coefficient (≥0.3), such as panel temperature, wind speed or humidity, leads to a decrease in the error in forecasting electricity generation. Moreover, adding unit active power generation to any set of meteorological parameters leads to an additional decrease in the error (Figure 10 and Figure 11).

This is due to the fact that the unit active power generation indirectly takes into account most of these parameters, reflecting the integral influence of environmental conditions. This is especially pronounced in the case where the unit active power generation is added to a small set of meteorological data (Table 3, items 1–5). When adding the unit active power generation to a large set of data (Table 3, item 6), the error decreases to a much lesser extent (Figure 12 and Figure 13).

This is because unit active power generation already takes into account the impact of all these meteorological parameters on electricity generation. The reduction in the error from 1.56% to 1.31% (Table 3, item 6) is due to the fact that unit active power generation also takes into account other meteorological parameters that were not considered in forecasting.

The unit active power generation is, in effect, a parameter that indirectly takes into account most meteorological parameters, reflecting the integral influence of environmental conditions on electricity generation. This emphasizes the importance of this indicator, because it allows us to account for hidden dependencies and implicit influences that are difficult to incorporate into the model in any other way.

Thus, to achieve an optimal balance between model complexity and forecast accuracy, it is recommended to use a set of parameters that includes insolation, time and unit active power generation. The consideration of other meteorological parameters may be justified only in specific conditions or to improve forecasting in short-term intervals.

The proposed forecasting method takes into account the influence of solar radiation by using the unit active power generation (P*), which allows the model to adapt to changes in input meteorological conditions. However, geographical location and seasonal fluctuations can affect the accuracy of the forecast. In such cases, slight deviations in forecast accuracy are possible, especially when weather conditions deteriorate sharply in ways that are difficult to predict from standard meteorological data. To minimize the influence of these factors, the study used the following:

- Adaptation of the model to local climatic features by retraining on regional samples;
- Additional meteorological parameters (temperature, humidity, cloudiness) to increase accuracy;
- Constant retraining of the ANN to ensure adaptability.

4. Conclusions

The study of correlation relationships between meteorological parameters and power generation by photovoltaic power plants confirms the significant role of meteorological factors, such as air and photovoltaic panel temperature, wind speed and time of day, in the electricity generation process. It is important to emphasize that the primary parameter influencing electricity generation by SPP is the level of solar radiation intensity. However, adding meteorological parameters can allow for more accurate forecasting of electricity production.

Currently, traditional forecasting models often focus solely on solar radiation, ignoring meteorological variables that also have a significant impact on electricity generation by solar power plants. The aim of estimating correlation coefficients between active power generation and meteorological parameters is to determine whether and to what extent these parameters should be included in forecasting models for solar power generation. The outcome of this analysis should be an optimal set of parameters used to build predictive models.

The inclusion of meteorological data in the model for forecasting SPP generation is justified only in the presence of moderate or high correlation of meteorological parameters with the output value, which is confirmed by correlation coefficients above 0.3–0.4. This approach allows us to reduce the volume of input data, avoiding taking into account factors that do not significantly affect the forecast accuracy. This is especially important for simplifying models and minimizing computational costs when training neural networks.

Analysis of the results showed that the correlations of meteorological parameters (e.g., panel temperature or air temperature) with power generation contain a component related to the dependence of both these parameters and power generation on solar radiation intensity. To isolate the influence of meteorological parameters (excluding the influence of solar radiation intensity), the article proposes a new indicator P*, defined by Formula (8), which allows for moving from absolute values of solar radiation intensity to active power production per unit of solar radiation. Moving from generated active power values to unit active power generation (P*) as a more objective basis for assessing correlation made it possible to move away from the dominant correlation with solar radiation intensity and to focus on parameters that directly (time) and indirectly (air temperature, panel temperature, air humidity) affect electricity production by photovoltaic panels.

It should be emphasized that determining the value of this indicator for a specific solar power plant is relatively easy, because both the generated active power and the intensity of solar radiation are continuously monitored at its location. Using the indicator of unit active power generation to forecast electricity production significantly increases the accuracy of forecasts, which is illustrated by the calculation results in Table 3.

The results of this study are important because they allow for an optimal approach to building neural network models for forecasting electricity production from photovoltaic panels. They help avoid the use of unnecessary data, prevent computational overload during calculations and improve the quality and accuracy of forecasting. This is achieved by using input parameters that have a strong correlation with the target values and clearly influence them.

Author Contributions

Conceptualization, Y.S., V.L. and R.P.; methodology, Y.S., V.L. and R.P.; software, V.L. and Y.S.; validation, Y.S. and R.P.; formal analysis, Y.S. and R.P.; investigation, V.L. and Y.S.; simulation, V.L. and T.B.; data curation, V.L. and T.B.; writing—original draft preparation, Y.S., V.L. and R.P.; writing—review and editing, Y.S. and R.P.; visualization, V.L. and T.B.; supervision, Y.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

AI	Artificial Intelligence
ANN	Artificial Neural Network
DL	Deep Learning
DOE	Design of Experiments
EU	European Union
GRU	Gated Recurrent Unit
IQR	Interquartile Range
LSTM	Long Short-Term Memory
ML	Machine Learning
PPP	Photovoltaic Power Plant
PV	Photovoltaic
RES	Renewable Energy Sources
RNN	Recurrent Neural Networks
SPP	Solar Power Plant
SVM	Support Vector Machine

References

1. Osińska, M.; Kyzym, M.; Khaustova, V.; Ilyash, O.; Salashenko, T. Does the Ukrainian Electricity Market Correspond to the European Model? Util. Policy 2022, 79, 101436.
2. Lezhnyuk, P.; Komar, V.; Kravchuk, S.; Lesko, V.; Netrebskiy, V. Meteorological Parameters Analysis for Hourly Forecast of Electricity Generation by Photovoltaic Power Station on the Day Ahead. In Proceedings of the 2018 IEEE 3rd International Conference on Intelligent Energy and Power Systems (IEPS), Kharkiv, Ukraine, 10–14 September 2018; pp. 235–238.
3. Szeberényi, A.; Bakó, F. Electricity Market Dynamics and Regional Interdependence in the Face of Pandemic Restrictions and the Russian–Ukrainian Conflict. Energies 2023, 16, 6515.
4. Theocharides, S.; Makrides, G.; Livera, A.; Theristis, M.; Kaimakis, P.; Georghiou, G.E. Day-Ahead Photovoltaic Power Production Forecasting Methodology Based on Machine Learning and Statistical Post-Processing. Appl. Energy 2020, 268, 115023.
5. Tawn, R.; Browell, J. A Review of Very Short-Term Wind and Solar Power Forecasting. Renew. Sustain. Energy Rev. 2022, 153, 111758.
6. Moreira, M.O.; Balestrassi, P.P.; Paiva, A.P.; Ribeiro, P.F.; Bonatto, B.D. Design of Experiments Using Artificial Neural Network Ensemble for Photovoltaic Generation Forecasting. Renew. Sustain. Energy Rev. 2021, 135, 110450.
7. Mellit, A.; Massi Pavan, A.; Ogliari, E.; Leva, S.; Lughi, V. Advanced Methods for Photovoltaic Output Power Forecasting: A Review. Appl. Sci. 2020, 10, 487.
8. Iheanetu, K.J. Solar Photovoltaic Power Forecasting: A Review. Sustainability 2022, 14, 17005.
9. Li, G.; Wang, H.; Zhang, S.; Xin, J.; Liu, H. Recurrent Neural Networks Based Photovoltaic Power Forecasting Approach. Energies 2019, 12, 2538.
10. Sobri, S.; Koohi-Kamali, S.; Rahim, N.A. Solar Photovoltaic Generation Forecasting Methods: A Review. Energy Convers. Manag. 2018, 156, 459–497.
11. Zafar, R.; Vu, B.H.; Husein, M.; Chung, I.-Y. Day-Ahead Solar Irradiance Forecasting Using Hybrid Recurrent Neural Network with Weather Classification for Power System Scheduling. Appl. Sci. 2021, 11, 6738.
12. Blinov, I.; Miroshnyk, V.; Sychova, V. Short-Term Forecasting of Electricity Imbalances Using Artificial Neural Networks. IOP Conf. Ser. Earth Environ. Sci. 2023, 1254, 012029.
13. Miroshnyk, V.; Shymaniuk, P.; Sychova, V. Short Term Renewable Energy Forecasting with Deep Learning Neural Networks. In Power Systems Research and Operation; Kyrylenko, O., Zharkin, A., Butkevych, O., Blinov, I., Zaitsev, I., Zaporozhets, A., Eds.; Studies in Systems, Decision and Control; Springer International Publishing: Cham, Switzerland, 2022; Volume 388, pp. 121–142. ISBN 978-3-030-82925-4.
14. Ziane, A.; Necaibia, A.; Sahouane, N.; Dabou, R.; Mostefaoui, M.; Bouraiou, A.; Khelifi, S.; Rouabhia, A.; Blal, M. Photovoltaic Output Power Performance Assessment and Forecasting: Impact of Meteorological Variables. Sol. Energy 2021, 220, 745–757.
15. Si, Z.; Yang, M.; Yu, Y.; Ding, T. Photovoltaic Power Forecast Based on Satellite Images Considering Effects of Solar Position. Appl. Energy 2021, 302, 117514.
16. Munir, M.A.; Khattak, A.; Imran, K.; Ulasyar, A.; Khan, A. Solar PV Generation Forecast Model Based on the Most Effective Weather Parameters. In Proceedings of the 2019 International Conference on Electrical, Communication, and Computer Engineering (ICECCE), Swat, Pakistan, 24–25 July 2019; pp. 1–5.
17. Liu, W.; Ren, C.; Xu, Y. Missing-Data Tolerant Hybrid Learning Method for Solar Power Forecasting. IEEE Trans. Sustain. Energy 2022, 13, 1843–1852.
18. Zhou, N.; Xu, X.; Yan, Z.; Shahidehpour, M. Spatio-Temporal Probabilistic Forecasting of Photovoltaic Power Based on Monotone Broad Learning System and Copula Theory. IEEE Trans. Sustain. Energy 2022, 13, 1874–1885.
19. Koivisto, M.; Cutululis, N.; Ekstrom, J. Minimizing Variance in Variable Renewable Energy Generation in Northern Europe. In Proceedings of the 2018 IEEE International Conference on Probabilistic Methods Applied to Power Systems (PMAPS), Boise, ID, USA, 24–28 June 2018; pp. 1–6.
20. Lezhniuk, P.; Komar, V.; Hunko, I.; Jarykbassov, D.; Tussupzhanova, D.; Yeraliyeva, B.; Katayev, N. Natural-Simulation Model of Photovoltaic Station Generation in Process of Electricity Balancing in Electrical Power System. Inform. Autom. Pomiary Gospod. Ochr. Sr. 2022, 12, 40–45.
21. Zsiborács, H.; Vincze, A.; Pintér, G.; Baranyai, N.H. The Accuracy of PV Power Plant Scheduling in Europe: An Overview of ENTSO-E Countries. IEEE Access 2023, 11, 74953–74979.
22. Maciejowska, K.; Nitka, W.; Weron, T. Enhancing Load, Wind and Solar Generation for Day-Ahead Forecasting of Electricity Prices. Energy Econ. 2021, 99, 105273.
23. Jasiński, M.; Leonowicz, Z.; Jasiński, J.; Martirano, L.; Gono, R. PV Advancements & Challenges: Forecasting Techniques, Real Applications, and Grid Integration for a Sustainable Energy Future. In Proceedings of the 2023 IEEE International Conference on Environment and Electrical Engineering and 2023 IEEE Industrial and Commercial Power Systems Europe (EEEIC/I&CPS Europe), Madrid, Spain, 6–9 June 2023; pp. 1–5.
24. Luis, G.; Esteves, J.; Da Silva, N.P. Energy Forecasting Using an Ensamble of Machine Learning Methods Trained Only with Electricity Data. In Proceedings of the 2020 IEEE PES Innovative Smart Grid Technologies Europe (ISGT-Europe), The Hague, The Netherlands, 26–28 October 2020; pp. 449–453.
25. Gao, Y.; Qi, S.; Ponoćko, J. Assessing Critical Data Types for Deep Learning-Based PV Generation Forecasting. In Proceedings of the 2023 IEEE Belgrade PowerTech, Belgrade, Serbia, 25–29 June 2023; pp. 1–6.
26. Yildiz, U.; Gol, M. Solar Power Generation Prediction for Distribution Systems with High PV Penetration. In Proceedings of the 2020 IEEE PES Innovative Smart Grid Technologies Europe (ISGT-Europe), The Hague, The Netherlands, 26–28 October 2020; pp. 389–393.
27. Sayenko, Y.; Baranenko, T.; Liubartsev, V. Forecasting of Electricity Generation by Solar Panels Using Neural Networks with Incomplete Initial Data. In Proceedings of the 2020 IEEE 4th International Conference on Intelligent Energy and Power Systems (IEPS), Istanbul, Turkey, 7–11 September 2020; pp. 140–143.
28. Box, G.E.P.; Jenkins, G.M.; Reinsel, G.C.; Ljung, G.M. Time Series Analysis: Forecasting and Control, 5th ed.; Wiley: Hoboken, NJ, USA, 2015.
29. Puccetti, G. Measuring Linear Correlation between Random Vectors. Inf. Sci. 2022, 607, 1328–1347.
30. Field, A. Discovering Statistics Using IBM SPSS Statistics, 4th ed.; SAGE Publications: London, UK, 2013.
31. Hyndman, R.J.; Athanasopoulos, G. Forecasting: Principles and Practice, 2nd ed.; OTexts: Melbourne, Australia, 2018; Available online: https://otexts.com/fpp2/ (accessed on 22 April 2025).

Figure 1. An example of the realization of the random power generation process P(t) by a solar power plant.

Figure 2. An example of the autocorrelation function of a random power generation process P(t) by a solar power plant.

Figure 3. An example of the implementation of a random process of active power generation P(t) and a random process of changing a selected meteorological parameter X(t).

Figure 4. A graph illustrating the daily relationship between changes in panel temperature and active power generated by a solar power plant.

Figure 5. Graph illustrating the daily relationship between changes in air temperature and active power generated by a solar power plant.

Figure 6. Graph illustrating the daily relationship between changes in wind speed and active power generated by a solar power plant.

Figure 7. Graph illustrating the daily relationship between changes in humidity and active power generated by a solar power plant.

Figure 8. Graph illustrating the daily relationship between changes in generated active power P, solar radiation G and unit active power generation P*.

Figure 9. Artificial neural network model for predicting SPP generation with different sets of input data.

Figure 10. Comparison of real and forecasted PV active power generation based on insolation.

Figure 11. Comparison of real and forecasted PV active power generation based on insolation with the unit active power generation.

Figure 12. Comparison of real and forecasted PV active power generation based on all data except the unit active power generation.

Figure 13. Comparison of real and forecasted PV generation based on all data, including the unit active power generation.

Table 1. Average (monthly) cross-correlation coefficients (r) and Spearman’s correlation (ρ) between meteorological parameters (X) and active power generation (P).

Parameter	The Value of the Coefficient (r)	Spearman’s Correlation (ρ)
Time	0.82	0.90
Air temperature	0.54	0.62
Dew point	0.30	0.39
Air humidity	−0.48	−0.52
Relative air pressure	0.21	0.34
Wind speed	0.56	0.64
Wind direction	−0.11	−0.15
Average solar radiation	0.92	0.95
Minimum solar radiation	0.92	0.95
Maximum solar radiation value	0.91	0.95
Pollution level of solar panels	−0.12	−0.17
Panel temperature	0.82	0.91

Table 2. Average (monthly) correlation coefficients between meteorological parameters and unit active power generation (P*).

Parameter	The Value of the Coefficient (r)	Spearman’s Correlation (ρ)
Time	0.71	0.75
Air temperature	0.35	0.38
Dew point	0.27	0.32
Air humidity	−0.28	−0.32
Relative air pressure	0.20	0.24
Wind speed	0.42	0.47
Wind direction	−0.10	−0.13
Pollution level of solar panels	−0.14	−0.18
Panel temperature	0.57	0.63

Table 3. Value of forecast errors (%) for 14 days.

No	Dataset/Days	1	2	3	4	5	6	7	8	9	10	11	12	13	14	14-Day Average Error [%]	Forecast Improvement [%]
1	Insolation	10.35	39.21	14.77	10.37	10.86	16.16	18.98	11.40	22.47	11.47	8.95	10.76	9.08	18.98	15.27	37.79
1	Insolation with unit active power generation	7.44	10.98	8.73	7.35	3.87	9.68	20.13	5.83	14.20	6.53	3.42	7.85	6.91	20.13	9.50	37.79
2	Insolation, time	11.13	16.83	9.64	12.25	6.92	13.38	20.59	7.70	23.86	11.01	4.57	14.72	14.27	20.59	13.39	34.06
2	Insolation, time with unit active power generation	4.54	13.15	4.37	4.70	3.47	15.50	19.11	3.90	10.70	4.58	3.82	9.27	7.36	19.11	8.83	34.06
3	Insolation, time, panel temp.	8.65	22.01	9.36	5.11	4.66	20.50	13.88	2.93	11.47	8.25	2.70	8.30	8.68	13.88	10.03	50.75
3	Insolation, time, panel temp. with unit active power generation	2.12	10.37	2.69	3.65	3.02	6.05	10.30	1.95	5.00	4.81	0.95	5.96	3.23	9.03	4.94	50.75
4	Insolation, time, wind speed	8.50	21.58	19.05	7.42	7.19	14.23	15.55	6.26	17.63	11.35	6.42	6.38	9.03	15.55	11.87	57.79
4	Insolation, time, wind speed with unit active power generation	6.11	15.36	3.99	3.68	2.86	5.58	5.11	1.92	7.80	4.47	2.06	3.05	3.10	5.11	5.01	57.79
5	Insolation, time, air temp.	6.87	13.87	6.69	2.75	4.16	23.73	11.33	3.00	12.62	7.83	2.86	6.17	5.11	11.33	8.45	23.20
5	Insolation, time, air temp with unit active power generation	3.53	14.81	5.96	1.63	3.41	21.90	9.18	2.98	7.80	5.84	1.54	4.36	2.71	5.25	6.49	23.20
6	All data without unit active power generation	0.80	4.20	2.08	0.52	1.03	4.52	1.66	0.57	1.47	0.62	1.00	1.32	1.03	0.98	1.56	16.03
6	All data with unit active power generation	1.00	2.32	2.16	0.68	2.05	5.44	1.50	0.53	1.22	0.29	0.15	0.34	0.39	0.33	1.31	16.03

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
