Hourly Photovoltaic Production Prediction Using Numerical Weather Data and Neural Networks for Solar Energy Decision Support

Abstract: The day-ahead photovoltaic electricity forecast is increasingly necessary for grid operators and for energy communities. In the present work, the hourly PV production is estimated using two models based on feedforward neural networks (FFNNs). Most existing models use solar radiation as an input. Instead, the models proposed here use numerical weather prediction (NWP) data: ambient temperature, relative humidity, and wind speed, which are easily accessible to anyone. The first proposed model uses multiple inputs, while the second one uses only the necessary information. A sensitivity analysis allows for the identification of the variables that are most influential on the estimation accuracy. This study concludes that the hourly temperature trend is the most important variable for prediction. The models’ accuracy was tested using experimental and NWP data, with the second model having almost the same accuracy as the first despite using fewer input data. The results obtained using experimental data as inputs show a coefficient of determination ($R^2$) of 0.95 for the hourly PV energy produced. The RMSE is about 6.4% of the panel peak power. When NWP data are used as inputs, $R^2$ is 0.879 and the RMSE is 10.5%. These models can have a significant impact by enabling individual energy communities to make their own forecasts, resulting in energy savings and increased self-consumed energy.


Introduction
Today, photovoltaics stands as one of the most crucial technologies for achieving green energy production goals and advancing towards a sustainable future. However, the inherent variability of solar energy poses a significant challenge when it comes to planning energy consumption. Predicting energy production from photovoltaics for the following day is a critical task that offers two substantial advantages in effective energy management:
- Consumers and energy communities can optimize their electricity usage based on forecasted energy availability. For instance, a consumer can strategically time the operation of appliances to reduce reliance on the grid, taking into account dynamic energy prices [1,2]. Various control systems are instrumental in scheduling electrical loads to ensure that they stay within the installed power capacity, and this coordination is significantly enhanced when integrated with a production forecasting system.
- Photovoltaic electricity production forecasting aids grid operators in planning energy distribution. The erratic nature of energy generated from renewable sources poses a challenge to maintaining grid frequency stability. Having prior knowledge of these fluctuations is increasingly crucial, especially as renewables are expected to contribute a larger share of the energy supply in the near future.
The prediction of energy generation can be categorized into short-term and long-term forecasts, with the prediction for the next day falling into the former category. These forecasting models are grouped based on the methodologies that they employ, with the most prominent categories being statistical, physical, and artificial intelligence (AI) methods.
The statistical approach involves seeking mathematical formulations that establish connections between input variables and electricity production. A widely used method within this category is the autoregressive moving average (ARMA) [3]. Wang et al. [4], on the other hand, used partial functional linear regression models. Although other techniques have evolved from ARMA [5,6], these approaches often provide less reliable results when addressing sudden changes in solar radiation. The inherent rapid fluctuations are not adequately captured by statistical methods.
Physical models, on the other hand, are based on equations that enable thermal and electrical modeling of photovoltaic panels. Various studies have proposed models capable of predicting both PV energy production and panel temperature [7]. These models are often built upon energy balance methods [8,9], which may utilize one- [10] or two-dimensional approaches [11]. Additionally, computational fluid dynamics (CFD) models have been introduced [12]. However, these methods require real-time access to climatic variables to accurately assess the thermoelectric behavior of PV panels, and such precise data are often unavailable for forecasting purposes, particularly for the following day.
The objective of this study was to employ an artificial intelligence approach to predict photovoltaic production. Artificial intelligence has assumed a pivotal role as a predictive tool in various applications, with a significant focus on solar energy production. Machine learning techniques, particularly diverse neural networks, have garnered extensive attention in the recent literature for forecasting PV production [13-15]. For instance, Pedro et al. [16] conducted a comparative analysis of several forecasting methods to evaluate their accuracy in predicting solar power output from a 1 MWp, single-axis tracking photovoltaic power plant in California. Their findings concluded that artificial neural network (ANN) models outperformed other methods.
In the context of monthly solar power output forecasting, a method employing seasonal decomposition and least-squares support-vector regression has been proposed [17]. An ANN has been integrated with data processing, input variable selection, and external optimization techniques to forecast PV system power output [18]. Furthermore, an ANN has been indirectly utilized for PV power prediction through solar irradiance forecasting [19]. A multilayer perceptron (MLP) model was suggested to forecast 24 h solar irradiance based on daily solar irradiance and air temperature data from an experimental database. The study included a practical application comparing the actual power output from a rooftop PV plant in the Municipality of Trieste with the power calculated using 24-h-ahead solar irradiance forecasts.
In the Republic of Korea, an ANN was employed to model urban energy supply plants and renewable energy availability, integrating energy-related legal regulations, standards, and energy plant facilities into an energy geographic information system database [20]. Forecasting power generation 24 h in advance using a radial basis function network (RBFN) was proposed in [21]. This technique directly forecasts PV systems' power output using historical records and real-time meteorological data. A recurrent neural network was introduced to predict PV power in a peak zone without relying on future meteorological forecasts, solely using PV power outputs and morning meteorological observations [22]. In another study, a seven-parameter electrical model and a feedforward neural network were cascaded to test multicrystalline PV panels' performance, achieving mean bias error deviations of less than ±1% [23]. A recurrent neural network model with long short-term memory was developed to recognize temporal patterns in data collected from 164 PV sites over 63 months, including weather conditions and estimated solar irradiation. The model achieved a normalized root-mean-square error of 7.416% and a mean absolute percentage error of 10.8% [24].
Energies 2024, 17, 466
Almonacid et al. [25] proposed a methodology for forecasting PV output one hour ahead using a dynamic artificial neural network. This approach employed two ANNs for predicting weather variables (solar irradiance and air temperature) and a third ANN to estimate the output power of a PV module. A fourth ANN incorporated the output of the preceding ANN and the PV configuration to provide final forecast values. Additionally, a combination of a linear regression model and an ANN was utilized to predict the performance of soiled PV modules using solar irradiation and ambient temperature [26]. Notably, an ANN has also been applied to suggest an active cooling algorithm based on fan cooling for the back surface of PV panels [27]. At the University of Malaya, an extreme learning machine (ELM) algorithm was developed to forecast the maximum power point tracking (MPPT) of three grid-connected PV plants, considering forecasting horizons of 1 h and 1 day ahead [28].
The majority of the models mentioned in the previous discussion utilized solar radiation as their primary input data and achieved highly satisfactory results. However, in practice, obtaining accurate solar radiation information as an input in advance using forecast models can be challenging. On the other hand, numerous studies focus on predicting solar radiation and subsequently employ a physical model to estimate photovoltaic output. Undoubtedly, solar irradiance plays a pivotal role in determining the electrical performance of PV panels. Nevertheless, the electricity production is not solely defined by solar irradiance. The conversion efficiency also relies on the cell temperature, which, in turn, is influenced by various boundary conditions. A comprehensive analysis necessitates that the neural network directly provides the electrical output, allowing it to factor in the panel's conversion efficiency.
In this study, two models are proposed, both based on artificial neural network (ANN) technology, with the goal of predicting the power output of a silicon photovoltaic module. In a departure from the existing literature, these models predict the PV module's performance without relying on solar radiation data as inputs. The aim is to perform the forecast using the numerical weather prediction (NWP) data that are easily accessible through websites. Specifically, this relies on hourly temperature, relative humidity, and wind speed data. These models have the potential to empower individual energy communities to create their own forecasts. This approach holds significant promise, as it only requires standard meteorological data for each location. Furthermore, this work offers insights into determining the optimal number of neurons for the neural network architecture and ranks the most critical information for short-term forecasting. This information can serve as a foundational reference for future research endeavors aimed at exploring this problem further.
The remainder of this paper is structured as follows: Section 2 introduces the applied methodology, presenting two distinct models, both starting from the hourly values of three selected quantities. The first model incorporates numerous additional inputs derived from daily processing of these hourly values, while the second model relies solely on the essential information. In Section 3, the study's outcomes are revealed. This section outlines the defined network architectures and presents the results of tests conducted using both experimental data and NWP data as inputs. Additionally, it includes a sensitivity analysis aimed at identifying the most critical variables in the models.

The Proposed ANN Models
The proposed ANN models were developed with the objective of predicting the hourly electricity production of a photovoltaic panel. Figure 1 provides a summary diagram of the investigation. The selected inputs should be readily obtainable from numerical weather forecasts; thus, the models rely on three crucial variables: hourly air temperature, relative humidity, and wind speed. The initial step involves configuring how these inputs are fed into the ANN. Data preprocessing is performed on a daily scale to generate additional input variables. Two models are introduced here for this purpose. The first model, referred to as Model1, incorporates all of the selected inputs, encompassing both hourly and daily data. Training and validation were conducted using experimental data collected within the laboratory of the Department of Mechanical Engineering, Energy, and Management. These phases aimed to identify the optimal neural network architecture and assess the quality of the prediction results. The other model, designated as Model2, explores the use of a reduced number of inputs. Model2 is derived from Model1 by systematically excluding one input variable at a time, allowing for an evaluation of the importance of each input variable. Similar to Model1, training and validation were carried out to determine the most effective network architecture. Subsequently, both networks underwent testing using experimental data collected during different seasons, including summer, spring, and winter. To assess the networks' stability and their ability to cope with potential errors associated with each input variable, a sensitivity analysis was conducted. Finally, a comprehensive evaluation was undertaken by utilizing NWP data as inputs for the models.

Artificial Neural Network
The artificial neuron is the basic element of a neural network. It functions in a similar way to a biological neuron, which generates an electrical impulse that propagates along the axon (i.e., the output of the neuron) only if the electrical potential of the neuron exceeds a certain threshold. Similarly, the artificial neuron analyses the intensity of each input, comparing it with a reference value (bias), and provides the output using an activation function. The data are then multiplied by a weight and reach another neuron as inputs. In mathematical terms, the output $o_k$ of neuron $k$ can be modeled with the following expression:

$$o_k = \varphi\left(\sum_{j=1}^{n} w_{kj}\, x_j + b_k\right)$$

where $\varphi$ represents the activation function, $b_k$ is the bias of the neuron, $n$ is the number of inputs to the neuron, $x_j$ is the input, and $w_{kj}$ is the weight assigned to each input through the synaptic connections. In this study, the activation function used is the sigmoid function:

$$\varphi(z) = \frac{1}{1 + e^{-z}}$$

This trigger function is often used in typical ANN applications and allows nonlinearity to be introduced into the overall input-output link.
The architecture of a feedforward neural network (FFNN) consists of numerous neurons arranged in input layers, hidden layers, and output layers. Each neuron of a layer is interconnected with all of the neurons of the next layer. The designer of a network has the task of identifying the numbers of hidden layers and neurons. Most scientific articles suggest a trial-and-error approach, and some articles suggest starting values for various attempts. The solution with only one hidden layer makes it possible to solve most problems with high accuracy. For the number of neurons to be used, Boger et al. [29] suggest starting with 70-90% of the number of input neurons.
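As an illustration of the neuron model described above, the following minimal Python sketch evaluates a single sigmoid neuron and a small feedforward network with one hidden layer. The function names, the additive bias convention, and the linear output neuron are assumptions made for the example; they are not taken from the authors' MATLAB implementation.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic sigmoid activation, 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias):
    """Output of one neuron: sigmoid of the weighted input sum plus the bias."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)


def layer_output(inputs, weight_rows, biases):
    """Fully connected layer: one neuron per row of weight_rows."""
    return [neuron_output(inputs, w, b) for w, b in zip(weight_rows, biases)]


def ffnn_forward(inputs, hidden_w, hidden_b, out_w, out_b):
    """One hidden sigmoid layer followed by a single output neuron.

    The linear output neuron is an assumption of this sketch.
    """
    hidden = layer_output(inputs, hidden_w, hidden_b)
    return sum(w * h for w, h in zip(out_w, hidden)) + out_b
```

Every neuron of the hidden layer receives all of the inputs, which mirrors the fully interconnected layers of the FFNN described in the text.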

Selection of Input Data
The input variables are temperature, relative humidity, and wind speed, provided on an hourly basis by the websites for the following day. To infer the predictability of the day ahead, the neural network is thus equipped with 24 values for each variable.
These input variables are able to provide valuable insights for predicting the electricity production. For instance, relative humidity tends to be higher on rainy or overcast days, while air temperature tends to be higher on sunny days. Wind speed plays a role in the heat transfer of the photovoltaic panel and affects its efficiency. Furthermore, it acts as an indicator that encapsulates information about the variations in atmospheric conditions due to pressure gradients, which can swiftly alter sky cover. All of these aspects are indicative of solar irradiance and also impact the cell temperature, a parameter with a direct influence on the conversion efficiency of the photovoltaic module. Therefore, these inputs all have a direct or indirect influence on electrical predictability. Additional essential inputs include the same data processed on a daily basis. For instance, information regarding the minimum and maximum daily temperatures provides the ANN with a fixed reference point for hourly temperature values. Moreover, a large daily temperature range typically signifies a clear day. The minimum and maximum relative humidity values are also critical; when the minimum relative humidity is close to 100%, it often indicates a high likelihood of rainy, overcast, or foggy conditions throughout the day. All of the selected input data that could influence PV electricity production are outlined in Table 1. The initial eight variables represent daily data and aid in categorizing the overall type of day, whether it is clear, cloudy, partly cloudy, etc. The remaining four variables are measured hourly and assist in understanding how electricity production is distributed over time. All of these data, in conjunction with the electrical output of a photovoltaic panel, are collected using an experimental setup and employed for training the neural network.
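The daily preprocessing step described above can be sketched as follows. This is a minimal illustration, assuming that the daily inputs are simple minimum, maximum, and mean statistics of the 24 hourly values; the function name and dictionary keys are hypothetical and only mirror the variable names used in the text.

```python
from statistics import mean


def daily_features(day_of_year, temp, rh, ws):
    """Build the eight daily inputs from 24 hourly values of T, RH and wind speed."""
    if not (len(temp) == len(rh) == len(ws) == 24):
        raise ValueError("expected 24 hourly values per variable")
    return {
        "DoY": day_of_year,
        "RH_min": min(rh),
        "RH_max": max(rh),
        "T_min": min(temp),
        "T_max": max(temp),
        "T_avg": mean(temp),
        "RH_avg": mean(rh),
        "ws_avg": mean(ws),
    }
```

The difference between `T_max` and `T_min` gives the daily temperature range that the text associates with clear-sky conditions.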

Model1
The first model, referred to as Model1, incorporates all of the selected inputs. The structure of Model1 is depicted in Figure 2, composed of two neural networks (ANN1 and ANN2). ANN1 processes daily data, while ANN2 deals with hourly data. The final output is the hourly PV energy produced (HPE), which is computed using the ANN2 network. This network also utilizes the daily PV energy production (DPE) as input information. This is particularly useful in estimating hourly energy production, as ANN2 can distribute the daily energy across timeslots with the aid of hourly temperature, relative humidity, and wind speed profiles. The DPE is estimated by ANN1, which exclusively operates with daily data. Specifically, ANN1 relies on the following data: DoY, $RH_{min}$, $RH_{max}$, $T_{min}$, $T_{max}$, $T_{avg}$, $RH_{avg}$, and $ws_{avg}$. On the other hand, ANN2, in addition to using DPE and the hourly data, also includes the minimum and maximum temperature and the minimum and maximum relative humidity. Finally, HPE represents the energy production for the current hour. To provide ANN2 with additional information, data from not only the current timestep but also the preceding and subsequent timesteps are included as inputs. This additional information helps the network account for potential fluctuations in production relative to neighboring timesteps.
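The way ANN2 combines the daily estimate with hourly data from neighboring timesteps can be sketched as below. This is only an illustrative assembly of the input vector: the ordering of the inputs, the function name, and the handling of the first and last hours of the day (the missing neighbor is replaced by the boundary value) are assumptions, since the text does not specify them.

```python
def ann2_inputs(hour, dpe, daily, temp, rh, ws):
    """Assemble one ANN2 input vector for a given hour of the day (0-23).

    dpe   : daily PV energy estimated by ANN1
    daily : dict with T_min, T_max, RH_min, RH_max
    temp, rh, ws : lists of 24 hourly values
    """
    # Assumption: at the day boundaries, reuse the boundary hour as its own neighbor.
    prev_h = max(hour - 1, 0)
    next_h = min(hour + 1, 23)
    vec = [dpe, daily["T_min"], daily["T_max"],
           daily["RH_min"], daily["RH_max"], hour]
    for series in (temp, rh, ws):
        # Preceding, current and subsequent timestep of each hourly variable.
        vec.extend([series[prev_h], series[hour], series[next_h]])
    return vec
```

Calling this function for each of the 24 hours yields the set of input vectors from which ANN2 would produce the hourly energy profile of the day.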


ANN Training Procedure
The artificial neural networks were trained to understand the connection between atmospheric variables and the hourly electrical energy production. To achieve this, the networks underwent training using a substantial volume of experimental data collected at the University of Calabria (Latitude: 39°21′ N; Longitude: 16°13′ E). The experimental setup was positioned on the rooftop of a building within the Department of Mechanical, Energy, and Management Engineering at the University of Calabria. This setup consisted of a south-oriented PV module affixed to a metallic structure, inclined at 30°. The PV module was constructed with polycrystalline silicon cells, measuring 1663 mm × 998 mm, with a total area of 1.46 m². It had a nominal efficiency of 14.5%, a nominal power output of 245 W, and a NOCT of 43 °C. The DC/AC conversion was performed by a micro-inverter equipped with a maximum power point tracker. Meteorological and climatic conditions were closely monitored by an integrated weather station, with the sensor specifications provided in Table 2. The data were collected over the years 2017, 2018, and 2019, with a one-minute timestep. However, since such a level of granularity is not necessary, the data were processed by computing hourly and daily averages. Data recorded in August, October, and December 2018 served as the validation set, while data from January, April, and July 2019 constituted the test set. The remaining data were employed for training. In total, the model had access to 867 days of data for learning the potential relationships between the hourly output and the input variables. A subset of 93 days was allocated for validation, and another 92 days was designated for testing. The training was conducted using MATLAB R2019a software, which randomly further divided the 867 days into training, validation, and test datasets, with percentages of 70%, 15%, and 15%, respectively.
The chosen learning method was supervised backpropagation. This method involves adjusting the neural network's weights in a backward manner, guided by the difference between the obtained value and the desired value. The goal is to progressively reduce the root-mean-square error (RMSE) calculated using the training dataset. Training halts when the RMSE, computed with MATLAB's validation dataset, ceases to decrease for six consecutive epochs, a technique known as cross-validation. This approach helps mitigate the risk of overfitting. Additionally, apart from the RMSE, other statistical indices, such as normalized errors (NRMSE, NMAE, and NMBE), are closely monitored. These normalized errors are defined by the following equations, where $x_f$ and $x_o$ represent the forecasted and observed values, respectively, $N$ is the number of samples, and $x_{o,max}$ is the maximum observed value:

$$\mathrm{NRMSE} = \frac{1}{x_{o,max}} \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(x_{f,i} - x_{o,i}\right)^2}$$

$$\mathrm{NMAE} = \frac{1}{x_{o,max}} \cdot \frac{1}{N} \sum_{i=1}^{N} \left|x_{f,i} - x_{o,i}\right|$$

$$\mathrm{NMBE} = \frac{1}{x_{o,max}} \cdot \frac{1}{N} \sum_{i=1}^{N} \left(x_{f,i} - x_{o,i}\right)$$
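The normalized error indices and the stopping rule described above can be sketched in a few lines of Python. The sketch assumes that each index is normalized by the maximum observed value and that the bias is computed as forecast minus observation, so that a negative NMBE means underestimation; the function names and the `patience` parameter are illustrative and do not reproduce the MATLAB training routine.

```python
import math


def nrmse(forecast, observed, x_o_max):
    """Root-mean-square error normalized by the maximum observed value."""
    n = len(observed)
    mse = sum((f - o) ** 2 for f, o in zip(forecast, observed)) / n
    return math.sqrt(mse) / x_o_max


def nmae(forecast, observed, x_o_max):
    """Mean absolute error normalized by the maximum observed value."""
    n = len(observed)
    return sum(abs(f - o) for f, o in zip(forecast, observed)) / n / x_o_max


def nmbe(forecast, observed, x_o_max):
    """Mean bias error (forecast minus observed), normalized; negative = underestimate."""
    n = len(observed)
    return sum(f - o for f, o in zip(forecast, observed)) / n / x_o_max


def should_stop(val_rmse_history, patience=6):
    """True when the validation RMSE has not improved for `patience` consecutive epochs."""
    if len(val_rmse_history) <= patience:
        return False
    best_before = min(val_rmse_history[:-patience])
    return min(val_rmse_history[-patience:]) >= best_before
```

In a training loop, `should_stop` would be called once per epoch with the list of validation RMSE values recorded so far.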

Reducing the Number of Variables to Define Model2
Model1 employs numerous input variables, and some of them carry redundant information, while others are not particularly valuable for forecasting. Consequently, a new model is introduced here, which operates exclusively with the essential variables. To discern which variables have the greatest influence on prediction, Model1 was retrained iteratively by excluding one input at a time. The input data were then ranked based on their impact on the accuracy of prediction, as determined by the RMSE calculated using the validation dataset. In addition to assessing the importance of variables for prediction accuracy, the Pearson correlation coefficient was employed to examine the interrelationships between the variables. This coefficient was calculated using the following formula:

$$r_{xy} = \frac{\sum_{i=1}^{N} \left(x_i - x_m\right)\left(y_i - y_m\right)}{\sqrt{\sum_{i=1}^{N} \left(x_i - x_m\right)^2}\, \sqrt{\sum_{i=1}^{N} \left(y_i - y_m\right)^2}}$$

where $x_m = \frac{1}{N} \sum_{i=1}^{N} x_i$ is the average of $x$ and $y_m = \frac{1}{N} \sum_{i=1}^{N} y_i$ is the average of $y$. If the coefficient is equal to zero, it suggests that the data are not linearly correlated. A coefficient greater than zero indicates a positive linear correlation, while a negative coefficient signifies an inverse correlation. This coefficient is calculated to assess the relationships between all possible input pairs, as well as between the inputs and the output variable. The importance ranking and Pearson's coefficient aid in the selection of the essential variables needed to create Model2. Like Model1, a thorough investigation was conducted to determine the most effective architecture for the neural network in Model2.
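The two selection tools described above, the Pearson coefficient and the leave-one-input-out ranking, can be sketched as follows. The ranking helper assumes that the validation RMSE obtained after excluding each input has already been computed by retraining the network; the function names and the dictionary of RMSE values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient between two equal-length sequences."""
    n = len(x)
    x_m = sum(x) / n
    y_m = sum(y) / n
    cov = sum((a - x_m) * (b - y_m) for a, b in zip(x, y))
    var_x = sum((a - x_m) ** 2 for a in x)
    var_y = sum((b - y_m) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def rank_inputs(validation_rmse_without):
    """Rank inputs from most to least important.

    validation_rmse_without maps each input name to the validation RMSE
    obtained when the model is retrained without that input. A larger
    RMSE after exclusion means that the input matters more.
    """
    return sorted(validation_rmse_without,
                  key=validation_rmse_without.get, reverse=True)
```

An input that ranks low and is also strongly correlated with another input is a natural candidate for removal when defining a reduced model.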

Tests and Sensitivity Analysis
Regarding both of the defined models, two tests were conducted to evaluate their real-world accuracy. The first test employed experimental data as inputs, gathered over 92 days (specifically, in January, April, and July 2019). Quantitative accuracy assessments were carried out using error indices such as NRMSE, NMBE, and NMAE. Additionally, a sensitivity analysis was carried out to explore the impact on performance resulting from errors in input variables. This analysis is critical because neural networks must be capable of functioning with NWP data as inputs, which can be subject to errors. The sensitivity analysis was performed using the same dataset, and it introduced perturbations in the hourly trends of the input variables to monitor the corresponding increase in RMSE calculated between the network's output and the target. Three types of errors were considered:
(1) A Gaussian error on hourly values, where the variation from the actual value is defined randomly, following a probability density defined by the standard Gaussian curve.
(2) An offset error that uniformly increases all hourly values by the same amount.
(3) An offset error that uniformly reduces all hourly values by the same amount.
The systematic errors introduced in points 2 and 3 are based on the standard deviations of the input variables. Values obtained with the introduction of these errors are subsequently processed to correct situations that cannot occur, such as relative humidity exceeding 100% or wind speed dropping below zero.
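The three perturbation types and the subsequent correction of impossible values can be illustrated with the sketch below. It assumes that the offset equals one standard deviation of the variable and that the Gaussian noise uses the same scale; the text only states that the errors are based on the standard deviations, so these magnitudes, the function name, and the `kind` labels are assumptions.

```python
import random


def perturb(values, kind, std, lower=None, upper=None, rng=None):
    """Return a perturbed copy of an hourly series, clipped to physical limits.

    kind  : "gaussian", "offset_up" or "offset_down"
    std   : standard deviation of the variable (assumed error scale)
    lower, upper : optional physical bounds, e.g. 0 and 100 for relative humidity
    """
    rng = rng or random.Random(0)
    if kind == "gaussian":
        out = [v + rng.gauss(0.0, std) for v in values]
    elif kind == "offset_up":
        out = [v + std for v in values]
    elif kind == "offset_down":
        out = [v - std for v in values]
    else:
        raise ValueError(f"unknown perturbation kind: {kind}")
    # Correct situations that cannot occur (e.g. RH > 100%, negative wind speed).
    if lower is not None:
        out = [max(lower, v) for v in out]
    if upper is not None:
        out = [min(upper, v) for v in out]
    return out
```

For example, relative humidity would be perturbed with `lower=0, upper=100`, and wind speed with `lower=0`; the perturbed series is then fed to the trained network and the resulting RMSE is compared with the unperturbed case.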
The final test involved the use of NWP data for the next day as inputs. Weather forecasts were obtained from websites such as weather.com [30] and ilmeteo.it [31]. These forecasts were acquired between 5 March and 18 March 2022. The results of this test can also be influenced by errors in the weather forecast models. Figure 3 illustrates the predicted temperature and relative humidity obtained from both websites. The temperature forecast from weather.com exhibits a smaller daily temperature range compared to the actual data. On the other hand, the temperature forecast from ilmeteo.it mirrors the actual temperature range but often underestimates the actual temperature, particularly during nighttime. Similar observations can be made concerning relative humidity, where the actual data exhibit greater variation than what is provided by the forecast websites. Notably, values of 100% are recorded in the actual data, while such values are seldom seen in the forecasts.

Model1 Architecture
The initial step in designing a neural network involves selecting the optimal architecture, including determining the numbers of hidden layers and neurons. It is important to note that, in neural networks, an increase in the number of nodes does not necessarily guarantee improved results (unlike situations where mesh densification is employed in fields such as mechanics and fluid dynamics). The only approach is to experiment with different configurations. To determine the best architecture for the networks, numerous trials were conducted by varying the number of nodes per layer from 3 to 35 and the number of hidden layers from 1 to 2. Each network underwent five separate training sessions with different sets of random initial weights. Out of these five training sessions, only the network that yielded the lowest RMSE on the validation dataset was retained.
Table 3 displays the results for the ANN1 architectures, with statistical indices normalized against the maximum value $x_{o,max}$ of 1713 Wh from the validation dataset. The choice of the most suitable configuration is based on the lowest NRMSE, which corresponds to the network with one hidden layer and only three neurons. This same network also demonstrates strong performance when compared to the others in terms of all of the statistical indices. The negative NMBE suggests a slight underestimation in the results. It is noteworthy that increasing the number of neurons tends to diminish the prediction performance.
Table 4 presents the results regarding the selection of the architecture for ANN2. In this case, the maximum hourly value $x_{o,max}$ is 241 Wh. The optimal network features one hidden layer and ten nodes. This configuration exhibits the most favorable behavior, including the NMAE and $R^2$ metrics, and generally underestimates the output result by approximately 1 Wh on average.
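The trial-and-error search described above can be sketched as a simple loop over candidate architectures with repeated random initializations. The training itself is delegated to a user-supplied callable, here called `train_and_validate`, which is a hypothetical placeholder that trains one network and returns its validation RMSE; the default ranges follow the text (3 to 35 nodes, 1 to 2 hidden layers, five restarts).

```python
import random


def select_architecture(train_and_validate,
                        neuron_counts=range(3, 36),
                        layer_counts=(1, 2),
                        restarts=5,
                        seed=0):
    """Return (rmse, n_layers, n_neurons, init_seed) with the lowest validation RMSE.

    train_and_validate(n_layers, n_neurons, init_seed) must train one network
    from random initial weights and return its RMSE on the validation set.
    """
    rng = random.Random(seed)
    best = None
    for n_layers in layer_counts:
        for n_neurons in neuron_counts:
            for _ in range(restarts):
                # A different seed per restart gives different initial weights.
                init_seed = rng.randrange(2 ** 31)
                rmse = train_and_validate(n_layers, n_neurons, init_seed)
                if best is None or rmse < best[0]:
                    best = (rmse, n_layers, n_neurons, init_seed)
    return best
```

Keeping only the overall minimum is equivalent to first retaining the best of the five restarts for each configuration and then choosing the configuration with the lowest validation error.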

Reducing the Number of Variables
A more in-depth analysis is required to ascertain which variables are essential for the model. Figure 4 illustrates the distribution of data points and the Pearson coefficients between all pairs of input variables within the complete dataset. This visualization helps identify connections between input variables, including nonlinear relationships that are not evident from Pearson's coefficient. Variables exhibiting strong mutual correlations provide redundant information, allowing for the removal of one of them. Conversely, a strong correlation with DPE (the output of ANN1) marks an input as valuable. Key observations from this analysis include the following:

- Day of year (DoY): Although not linearly related to any quantity, the distributions of data points in relation to daily temperatures suggest a connection between these variables. It appears that DoY could be deduced from air temperature data, and, to some extent, relative humidity is also influenced by DoY. However, daily production (DPE) is not directly linked to DoY, as it can reach high values in all months, with lower values typically observed in the winter.
- Temperatures (T min, T max, and T avg): These temperature variables are highly interrelated. Notably, T avg is the least influential variable among them, with Pearson coefficients of 0.97 when compared to T min and T max. T min and T max are also correlated with one another. However, their difference, which represents the daily temperature fluctuation, can provide valuable information related to average cloudiness.
- Daily relative humidity: [...] relates well with several other known variables, providing similar information. The maximum relative humidity (RH max) often saturates at 100%, which could be used by the ANNs to gauge the level of cloudiness.
- Average wind speed (ws avg) and hourly wind speed: These wind speed variables do not display strong correlations with any other inputs.
- Hour of day (HoD): This proves indispensable, as it demonstrates minimal correlation with other variables. This means that the unique information that it provides to ANN2 cannot be substituted by other inputs. Notably, HoD exhibits a near-zero Pearson coefficient with almost all other inputs, except for its low correlations with hourly data for air temperature, relative humidity, and wind speed. Specifically, the point distributions reveal that air temperature and wind speed tend to peak in the early afternoon, while relative humidity decreases.
- Hourly data: A negative correlation is observed between hourly temperature (T) and hourly relative humidity (RH), with a Pearson coefficient of −0.73.

Figure 5 presents the correlation index between the output variable, which is the hourly energy produced by photovoltaics (HPE), and all input parameters. In this context, high correlations indicate the usefulness of the input in predicting the output. Notable linear correlations are primarily observed with hourly air temperature and hourly relative humidity (0.47 and −0.58, respectively). There is a lower correlation with hourly wind speed (0.26). Daily variables do not significantly influence the hourly PV energy production. The graph depicting RH min indicates that higher values, such as those close to 100%, correspond to lower energy production. Additionally, DPE exhibits a slight correlation with HPE: when daily energy production is low, hourly energy production is also low. As expected, HPE is uniformly distributed with respect to the hour of day (HoD). While the relationship is not strictly linear, HoD plays a crucial role in ensuring that the network's output aligns with the actual target.

To determine which variables could be eliminated, Model1 was retrained by systematically excluding one input at a time. This methodology allowed us to evaluate the significance of each variable in influencing the output. Changes in model performance were assessed by monitoring the RMSE on HPE with the validation dataset. Figure 6 illustrates that the RMSE increases when a variable is eliminated compared to the full model. The omission of HoD carries substantial weight, resulting in a roughly 50% increase in RMSE compared to the full model. Conversely, the information provided by DoY proves to be redundant. The networks can infer this information from the average, minimum, and maximum daily temperatures. Indeed, Figure 4 illustrates a strong correlation between these temperatures and the day of the year.

Furthermore, Figure 6 demonstrates that all daily mean values, denoted by the subscript avg, are dispensable. Similarly, the maximum relative humidity does not contribute valuable information. In contrast, the minimum relative humidity, as previously noted, holds more significance than the other daily humidity data, exhibiting a stronger correlation with DPE. This is supported by its higher ranking compared to the others. RH min, along with DPE, holds a mid-ranking position, indicating their nearly equal influence, possibly due to the similarity in the information that they provide. Since the absence of either variable leads to only a moderate increase in RMSE, it is reasonable to eliminate them, as they do not significantly contribute to the network. At the top of the importance ranking are the hourly quantities, along with the daily minimum and maximum temperatures.
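The retraining procedure described above (drop one input, retrain, compare the validation RMSE with that of the full model) can be sketched as follows. This is a minimal illustration under stated assumptions: an ordinary least-squares fit stands in for the feedforward network, the data are synthetic, and all names (`fit_predict`, `leave_one_input_out`, the three demo inputs) are hypothetical rather than taken from the study.

```python
import numpy as np

def fit_predict(X_train, y_train, X_val):
    """Stand-in model: least squares with a bias term (NOT the paper's FFNN)."""
    A = np.column_stack([X_train, np.ones(len(X_train))])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([X_val, np.ones(len(X_val))]) @ coef

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def leave_one_input_out(X_train, y_train, X_val, y_val, names):
    """Retrain once per omitted input and report the relative RMSE increase."""
    base = rmse(y_val, fit_predict(X_train, y_train, X_val))
    increase = {}
    for j, name in enumerate(names):
        keep = [k for k in range(X_train.shape[1]) if k != j]
        err = rmse(y_val, fit_predict(X_train[:, keep], y_train, X_val[:, keep]))
        increase[name] = (err - base) / base
    # Most important input (largest RMSE increase when omitted) comes first.
    ranked = dict(sorted(increase.items(), key=lambda kv: kv[1], reverse=True))
    return base, ranked

# Synthetic demo data (hypothetical, not the measurements used in the study).
rng = np.random.default_rng(0)
n = 400
useful = rng.normal(size=n)
redundant = useful + 0.01 * rng.normal(size=n)  # nearly a copy of `useful`
unique = rng.normal(size=n)                     # information found nowhere else
X = np.column_stack([useful, redundant, unique])
y = 2.0 * useful + 3.0 * unique + 0.1 * rng.normal(size=n)

base_rmse, ranking = leave_one_input_out(
    X[:300], y[:300], X[300:], y[300:], ["useful", "redundant", "unique"]
)
for name, inc in ranking.items():
    print(f"{name}: RMSE change {100 * inc:+.1f}%")
```

In this toy setting, the input that carries unique information tops the ranking, while either of the two mutually redundant inputs can be dropped at almost no cost, which mirrors the roles of HoD and DoY discussed above.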

Model2 Design and Architecture
The analysis of the input variables' importance led to the development of Model2, which exhibits a simplified structure (depicted in Figure 7) compared to Model1. The excluded variables in Model2 are RH avg, ws avg, RH max, T avg, RH min, DPE, and T max. The criteria for elimination were primarily based on the ranking presented in Figure 5, with one exception. The DoY was retained, as it is not subject to prediction error. Its significance in Model1 is minimal due to redundancy with the information provided by the three daily temperatures, as demonstrated in Figure 4. The elimination of some of these temperatures, which are susceptible to forecast errors, could restore importance to DoY. Specifically, T max was omitted, given that the simultaneous presence of DoY, T min, and T max would offer redundant information. Although these three variables are interconnected, the network requires two pieces of information: a daily temperature value for referencing hourly temperature values, and knowledge of the maximum daily fluctuation for characterizing sky coverage. Consequently, DoY was retained, while T max was removed due to its lower ranking compared to T min. Since DPE was eliminated, there was no need to introduce the first neural network present in Model1. The structure of Model2 is consequently lighter and reduced to a minimum.

The neural network in Model2 is denoted as ANN3, featuring 10 input nodes and 1 output node. Similar to Model1, the architecture was determined through a trial-and-error process. Various configurations were tested, including those with one and two hidden layers, with the number of nodes in each layer ranging from 3 to 35. The results obtained are summarized in Table 5. The optimal network was identified as having one hidden layer with 10 nodes. On the validation dataset, this configuration achieved an NRMSE of 6.00%. Networks with two hidden layers exhibited inferior performance. The best configuration ultimately yielded an NMAE of about 3.4%, an NMBE of −0.4%, and an R² of 0.949, indicating a strong correlation between the output and target values.
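For reference, the normalized indices quoted above can be computed as in the following sketch. It assumes that errors are normalized by the panel peak value (the abstract expresses the RMSE as a percentage of the panel peak power), that the bias is taken as predicted minus observed so that negative values mean underestimation, and that R² is the coefficient of determination. The function name `forecast_metrics` and the parameter `peak_wh` are hypothetical, and the numbers are illustrative only.

```python
import numpy as np

def forecast_metrics(observed, predicted, peak_wh):
    """Normalized error indices; `peak_wh` is the assumed normalization constant."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    err = predicted - observed  # negative values mean underestimation
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return {
        "NRMSE_%": 100 * np.sqrt(np.mean(err ** 2)) / peak_wh,
        "NMAE_%": 100 * np.mean(np.abs(err)) / peak_wh,
        "NMBE_%": 100 * np.mean(err) / peak_wh,
        "R2": 1 - ss_res / ss_tot,
    }

# Illustrative hourly energies in Wh (not data from the study).
m = forecast_metrics([0, 100, 200, 100], [0, 90, 190, 100], peak_wh=200)
print(m)
```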
In analogy to the previous procedure with Model1, Figure 8 illustrates the ranking of the most important variables when the network is retrained with the omission of certain inputs.It is noteworthy that further elimination of some variables leads to a deterioration in results.Relative humidity gains increased significance compared to Model1, since information about the daily maximum and minimum limits of the same variable is no longer available.The minimum temperature loses positions in the ranking, and the least useful variable continues to be the day of the year.This underscores the importance of retaining certain variables to preserve the network's performance and highlights the role of relative humidity in the absence of specific temperature data.

Testing and Sensitivity Analysis
The accuracy of the models must be assessed using the test dataset, and Table 6 presents the calculated statistical indices for both models. The results pertain to HPE. The NRMSEs are comparable to those obtained with the validation dataset, indicating that overfitting was avoided.

Daily Electricity Forecast
Figure 9 illustrates the estimated daily electrical energy (DPE) produced by both models, comparing them with experimental data over the course of one year (2017).The models closely track the daily electrical energy, effectively capturing the nature of each day and providing reliable production estimates.During summer, challenges arise on cloudy days, while on winter days both models adeptly align with the experimental data.
In January, the daily photovoltaic (PV) electrical energy production in the test dataset was lower than in other months. Despite the high variability of the atmospheric data, the models followed the experimental trend with very good accuracy. In April and July, the models performed well, although they faced more difficulties than in the winter months, especially on cloudy days. In particular, on 9 April and 10 July, the models underestimated the electricity production. However, they correctly predicted the reduction in electricity on 29 July.

Figure 10 illustrates the distribution of predicted daily photovoltaic electrical energy in comparison to the observed data. Model1 exhibits an RMSE of approximately 128 Wh, with an R² of 0.914. On average, the data are underestimated by about 23 Wh. Model2 shows a slightly higher RMSE, at 135 Wh, with an R² of 0.902 and an MBE of approximately −17.7 Wh. The latter is lower in absolute terms than that obtained with Model1. In both cases, the slope of the regression line is slightly lower than that of the quadrant bisector, and the intercepts of the lines are slightly higher than the origin of the axes. The most significant errors occur at points with intermediate magnitudes. The models demonstrate precision in identifying electricity production on both clear and overcast days, while encountering some difficulties on partly cloudy days. Despite these challenges and the utilization of only temperature, relative humidity, and wind speed data, the results can still be deemed highly satisfactory.

Hourly Electricity Forecast
The models generate outputs in the form of hourly electrical energy produced by the photovoltaic panel, as depicted in Figure 11 using results from the test dataset. For visualization purposes, five representative days are displayed for each of the three months, illustrating different sky conditions. On 1 January, a notable day, high electricity production was observed in the morning, before sharply declining to almost zero in the afternoon. Both models effectively predicted the morning production but struggled to accurately forecast the afternoon electricity values. The two cloudy days, 2 January and 5 January, were accurately predicted. 3 January was a clear day, and this was also correctly assessed. However, the models exhibited imperfections on 4 January, failing to predict energy production during peak hours. This scenario aligns with days that do not fit squarely into the clear or cloudy categories.

In April and July, electricity production was higher than in January. Attention should be directed to the time trend during this analysis. In all cases, the energy produced was closely followed during morning and afternoon hours, with some disparities in the peak power recordings. The models satisfactorily estimated the unique trends observed on the mornings of 13 April, 16 April, and 28-29 July. However, peak power was often underestimated, with an exception on 15 April. Overall, the models successfully identified the experimental trends.

Figure 12 presents regression lines for both models in relation to hourly data, with points distributed around the bisector of the quadrant. Similar to the findings for daily PV electricity values, forecasting challenges were more prominent for intermediate powers. Nonetheless, the majority of cases were accurately estimated, resulting in RMSE values ranging from 14 to 15 Wh for both models. The R² was around 0.95.
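The comparison between the regression line and the quadrant bisector can be reproduced with a first-degree least-squares fit of predicted against observed values. The sketch below is illustrative only: the function name `regression_vs_bisector` and the sample values are assumptions, not results from the study. A slope below 1 combined with a positive intercept and a negative mean bias is the pattern reported for the daily values.

```python
import numpy as np

def regression_vs_bisector(observed, predicted):
    """Fit predicted = slope * observed + intercept and return the mean bias."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    slope, intercept = np.polyfit(observed, predicted, deg=1)
    mbe = float(np.mean(predicted - observed))  # negative means underestimation
    return float(slope), float(intercept), mbe

# Illustrative energies in Wh; a perfect forecast would give slope 1, intercept 0.
obs = np.array([0.0, 50.0, 100.0, 150.0, 200.0])
pred = 0.9 * obs + 5.0
slope, intercept, mbe = regression_vs_bisector(obs, pred)
print(f"slope={slope:.2f}, intercept={intercept:.1f} Wh, MBE={mbe:.1f} Wh")
```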

Sensitivity Analysis
The input variables coming from numerical weather prediction (NWP) are susceptible to errors, which can be significant. Prediction models must respond appropriately even when used with inaccurate input data. Notably, the hour of the day and the day of the year are inherently error-free quantities. Therefore, the subsequent analysis focuses on the models' behavior solely in response to errors in temperature, relative humidity, and wind speed. To assess the impact of input errors, perturbations were introduced to the trends of these variables, monitoring the corresponding increase in the RMSE of the output, namely the predicted hourly photovoltaic electrical energy (HPE). Three types of errors were considered: (1) a random error, (2) an upward offset error, and (3) a downward offset error, each equivalent to the standard deviation calculated from the test dataset. The graphs presented in Figure 13 illustrate the increase in the RMSE of the predicted HPE for the three error cases examined.

The Gaussian error modifies the time trend, eliminating the information on the gradients with respect to the preceding and succeeding times. The perturbations are distributed around the actual mean values of the quantities, so the overall trend of the variables remains the same. The graph indicates that temperature errors significantly impact the final result, with Model1 demonstrating better resilience than Model2. Conversely, the same error introduced in relative humidity and wind speed leads to an RMSE increase of about 5%, with Model2 exhibiting better adaptability than Model1.

The models are stable with respect to temperature and wind speed when their values are shifted upwards. However, both models suffer greatly from this type of error when it is applied to relative humidity. On the other hand, in cases where the input variables are reduced by a constant value, the models continue to behave appropriately. The systematic error therefore only affects the models' performance when the relative humidity is increased by a constant value. In fact, the models use relative humidity to detect overcast or rainy days, so overestimating it implies that the models perceive the day as cloudier than it actually is. Model1 appears to be more dependent on daily data, such as maximum relative humidity, while Model2, relying primarily on the hourly variations in quantities, exhibits a weaker dependency.
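The three perturbation cases can be sketched as below. This is a minimal illustration, assuming that the random error is zero-mean Gaussian noise whose standard deviation equals that of the perturbed variable and that each offset equals one standard deviation. The linear `model` and the synthetic data are hypothetical stand-ins for the trained networks and the test dataset.

```python
import numpy as np

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def perturb(x, kind, rng):
    """Apply one of the three error types, scaled by the variable's std."""
    sigma = x.std()
    if kind == "random":
        return x + rng.normal(0.0, sigma, size=x.shape)
    if kind == "offset_up":
        return x + sigma
    if kind == "offset_down":
        return x - sigma
    raise ValueError(f"unknown perturbation: {kind}")

def rmse_increase(model, X, y, column, kind, rng):
    """Relative RMSE increase when a single input column is perturbed."""
    base = rmse(y, model(X))
    X_pert = X.copy()
    X_pert[:, column] = perturb(X[:, column], kind, rng)
    return (rmse(y, model(X_pert)) - base) / base

# Hypothetical stand-in: column 0 drives the output, column 1 barely matters.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 2))
model = lambda data: 10.0 * data[:, 0] + 0.1 * data[:, 1]
y = model(X) + rng.normal(0.0, 1.0, size=500)

inc_strong = rmse_increase(model, X, y, column=0, kind="offset_up", rng=rng)
inc_weak = rmse_increase(model, X, y, column=1, kind="offset_up", rng=rng)
print(f"influential input: {100 * inc_strong:+.0f}%, weak input: {100 * inc_weak:+.1f}%")
```

Running the same loop over every weather input and every error type yields one bar per combination, which is the kind of comparison shown in Figure 13.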

Tests with NWP Data
The tests were conducted utilizing the atmospheric forecast illustrated in Figure 3, and Figure 14 displays the hourly results. When using weather.com as a data source, the outcomes were satisfactory, except for 5 March, where the predicted production exceeded the actual production. On the cloudy days of 6-7 March, Model2 accurately followed the actual trend. However, the reduction in production at noon on 8 March was not predicted by either model. When ilmeteo.it was used as a source, the results showed a slight deterioration. Specifically, on 5 March, the models predicted very low production. The following day saw well-predicted morning hours, but a peak in afternoon production went undetected. On 7 March, the models underestimated the electricity production. Although the daily electricity production on 8 March was well predicted, the sudden reduction at 12:00 (noon) was not detected by either model. For clear days, Model2 appeared to be more accurate in predicting the hourly pattern, with both models recognizing these as clear days. Only on 11 March was the electricity production underestimated. The statistical indices presented in Table 7 reveal that using NWP from ilmeteo.it yields an RMSE of about 27 Wh with both models. The best results were obtained with NWP from weather.com, with Model2 performing the best, achieving an RMSE of 24.9 Wh. However, it should be noted that the overall performance was lower than in the previous test. Undoubtedly, the inherent errors in the weather forecasts from which the data were derived significantly impacted the estimation of electricity production.
Unlike this study, various other studies relying on NWP data have incorporated solar radiation as a predictive input.For instance, the model proposed in Ref. [32] achieved an NMAE of 6%, consistent with the findings of our current study.Using statistical methods, Giorgi et al. [33] reported forecast NRMSE values of 12.57%, 12.60%, and 10.91% for input vectors involving historical PV output, solar irradiance, and module temperature, respectively.On the other hand, Sharma et al. [34] obtained an NRMSE ranging from 9.42% to 15.41%, also incorporating solar radiation as an input.Therefore, the results of the current study can be considered very good, given the utilization of a reduced number of variables.
In addressing potential challenges, it is important to recognize the impact of varying wind speeds on the method's performance.The training dataset used in this study reflects a diverse range of wind speed conditions.However, it is noteworthy that the reliability of the method may be influenced by the specific patterns of temperature and humidity coupling, which can be different in other locations.In particular, the robustness of the proposed method may face challenges when applied to locations with distinct temperature and humidity pairings.Therefore, it is crucial to acknowledge the limitations of the model's generalization across diverse climatic regions.Subsequent investigations and additional training with datasets from various locations may be required to enhance the method's validity in such cases.

Conclusions
Photovoltaics is emerging as a pivotal technology in harnessing renewable energy sources, playing a crucial role in the transition toward a decarbonized energy future.The primary objective of this study was to forecast the electricity generated by photovoltaic panels on the following day.This prediction was achieved using easily accessible input data obtained from weather forecast websites (specifically, air temperature, relative humidity, and wind speed).
To accomplish this goal, two forecasting models with hourly resolution were developed based on artificial neural networks. The first model incorporates various data as inputs, including hourly values, and is supplemented with processed daily values to aid in identifying the type of day (i.e., overcast, clear, or partly cloudy). Subsequent analysis enabled the determination of the relative importance of each variable, leading to the elimination of redundant or unhelpful information. Model2 shares the same objective as the initial model but employs a reduced set of input data while maintaining a similar accuracy. The training process utilized experimental data gathered over three years at the University of Calabria. Notably, it was found that a single hidden layer was sufficient for the feedforward networks, eliminating the need for multiple hidden layers.

Energies 2024, 17, 466

The key conclusions drawn from this study include the following:
(1) The day of the year is not important for the prediction, as similar information is provided by the minimum and maximum daily temperatures.
(2) The daily minimum relative humidity correlates with the daily PV energy production, with a good Pearson's coefficient: −0.88.
(3) The models are stable if the input variables have a constant offset error.
(4) The most valuable information for the prediction is the hourly temperature trend.
(5) The models provide very good estimates when using experimental data as inputs. The coefficient of determination is about 0.95, with an RMSE of about 15 Wh.
(6) The accuracy of the forecast slightly decreases when the input information is taken from weather forecast websites. The coefficient of determination was 0.879 in the two weeks analyzed. The RMSE was 24.9 Wh. The accuracy of the forecast is closely linked to the accuracy of the NWP data. The results are dependent on source data, but they are nevertheless appreciable.
(7) The good behavior of Model2 implies that it is not necessary to provide too much information. Hourly trends of the three meteorological quantities and the daily minimum temperature are sufficient.
The limitation of this study is that the networks were trained on local climatic conditions.It would be interesting to assess whether they also perform well in different locations.Despite this limitation, our research has yielded valuable insights into electricity generation forecasting, addressing the challenges posed by the variable availability of solar sources-a concern that is gaining significance.The findings derived from experimental measurements offer valuable information for understanding the factors that exert the most influence on forecasting accuracy.
The practical implications of this work extend to utilities, where the economic impact is manifested through energy savings achieved via effective scheduling of electrical loads and an increase in self-consumed energy. The model relies on data that are easily accessible from websites, so it is usable by everyone. Leveraging public data also eases the integration of this technology into control systems, facilitating its broader applicability. The incorporation of these models into smart grid frameworks represents a promising trajectory. The models' hourly resolution suits the dynamic nature of smart grids, supporting timely adjustments and a more interconnected, responsive energy ecosystem. Microgrid architectures, which often rely on renewable sources, could benefit from the precision of these models in adapting to the fluctuations inherent in distributed energy systems. By facilitating informed decision-making on energy consumption patterns, these models contribute to the broader mission of transitioning towards sustainable and environmentally conscious energy practices. Moreover, as technological landscapes evolve, the adaptability of these models can be explored in conjunction with emerging technologies such as Internet of Things (IoT) devices and advanced sensors. In essence, the versatility of the hourly PV models positions them as catalysts for advancements in renewable energy utilization.

Figure 5. Correlation between the inputs and target.

Figure 6. RMSE on the hourly energy produced with reference to the validation dataset. Networks trained by eliminating single inputs from Model1.

Figure 8. RMSE on the hourly energy produced with reference to the validation dataset. Networks trained by eliminating single inputs from Model2.

Figure 9. Daily electrical energy production with the training and test datasets.

Figure 10. Daily PV energy output-target regression with the test dataset.

Figure 11. Hourly electrical energy production with the test dataset.

Figure 12. Hourly PV energy output-target regression with the test dataset.

Figure 13. Increase in RMSE for HPE with errors in input variables.

Table 3. Definition of ANN1's architecture. Statistical indices on the validation dataset.

Table 4. Definition of ANN2's architecture. Statistical indices on the validation dataset.
max has the strongest correlation with DPE, with a coefficient of 0.72. Additionally, these temperatures demonstrate an inverse correlation with relative humidity data.
- Relative humidity (RH min, RH max, and RH avg): RH min is a critical parameter, as it exhibits a strong negative correlation with DPE (Pearson coefficient of −0.88). This suggests that it is an important parameter for estimating daily electricity production. In contrast, the average relative humidity (RH avg) appears to be less critical, as it correlates well with several other known variables, providing similar information. The maximum relative humidity (RH max) often saturates at 100%, which could be used by ANNs to gauge the level of cloudiness.
- Average wind speed (ws avg) and hourly wind speed (ws): These wind speed variables do not display strong correlations with any other inputs.
- Hour of day (HoD): This proves indispensable, as it demonstrates minimal correlation with other variables. This means that the unique information that it provides to ANN2

Table 5. Definition of ANN3's architecture. Statistical indices on the validation dataset.

Table 6. Statistical indices for HPE on the test dataset.

Table 7. Statistical indices for HPE with NWP.