Photovoltaic Power Prediction Using Artificial Neural Networks and Numerical Weather Data

Abstract: The monitoring of power generation installations is key for modelling and predicting their future behaviour. Many renewable energy generation systems, such as photovoltaic panels and wind turbines, strongly depend on weather conditions. However, in situ measurements of relevant weather variables are not always taken into account when designing monitoring systems, and only power output is available. This paper aims to combine data from a numerical weather prediction model with machine learning tools in order to accurately predict the power generation from a photovoltaic system. An artificial neural network (ANN) model is used to predict power outputs from a real installation located in Puglia (southern Italy) using temperature and solar irradiation data taken from the Global Data Assimilation System (GDAS) sflux model outputs. Power outputs and weather monitoring data from the PV installation are used as a reference dataset. Three training and testing scenarios are designed. In the first one, weather monitoring data are used both to train the ANN model and to predict power outputs. In the second one, training is done with monitoring data, but GDAS data are used to predict the results. In the third one, both training and prediction are done by feeding GDAS weather data into the ANN model. The results show that the tested numerical weather model can be combined with machine learning tools to model the output of PV systems with less than 10% error, even when in situ weather measurements are not available.


Introduction
Anthropogenic climate change and increasing levels of pollution are critical issues that continue to motivate the transition from the use of fossil fuels to renewable energy sources. The Paris Agreement, signed by 196 countries during the 2015 United Nations Climate Change Conference, still drives most United Nations members to reduce their dependency on non-renewable energy sources. After stalling in 2018, installed renewable power is expected to increase by 1200 GW between 2019 and 2024. Solar PV energy sources are expected to account for more than half of this increase, followed by onshore wind systems [1]. Despite the disruptions caused in global energy markets by the Covid-19 pandemic, the total generation of renewable energy is expected to increase by up to 5% in 2020 [2].
Most renewable energy technologies rely on atmospheric conditions to generate electric power. Wind speed and direction determine the performance of wind turbines. Solar irradiation is the key factor that controls the power output of photovoltaic (PV) and thermal solar systems, with other variables like air temperature and humidity also affecting their performance. Reliable weather data is required to either evaluate the best placement for a new renewable installation or predict the future output of an existing system [3,4]. However, meteorological data is not always monitored at the location of renewable energy production systems, so historical local datasets may not be available. When in situ weather measurements are not available, physical or statistical models for computing the required variables [5], spatial interpolations of data from regional networks of weather stations [6], or numerical weather prediction (NWP) models [7] can be used.
Several studies have attempted to forecast weather-related variables using artificial neural network (ANN) models. In [8], nonlinear autoregressive (NAR) neural networks were used to predict the fluid flow rate in shallow aquifers. In [9], changes in precipitable water vapour were estimated by a nonlinear autoregressive approach with exogenous input (NARX). In [10], temperature and wind speed were predicted using different upgraded versions of convolutional neural networks (CNN). In [11], the performance of multilayer perceptron (MLP) and multigene genetic programming (MGGP) neural networks for estimating the solar irradiance in PV systems is compared.
In many publications, different regression and machine learning algorithms have also been used in combination with weather data from NWP to forecast future PV power. In [12], the GPV-MSM mesoscale model (5 km of horizontal resolution) of the Japanese Meteorological Agency was combined with a support vector regression algorithm. In [13], the Global Forecast System (GFS) 0.5° product (0.5° horizontal resolution) of the National Oceanic and Atmospheric Administration of the United States was used to feed a multivariate adaptive regression splines model. In [14], an ANN model was trained with a numerical model from the European Centre for Medium-Range Weather Forecasts, although the name of the particular numerical model was not specified. ANNs are a family of machine learning techniques which are widely used to predict PV system output. An extensive review on forecasting techniques applied to PV power is presented in [15]; in that publication, ANN models are described as the most widely used option for PV forecasting among statistical, physical, and hybrid techniques. Another review on this issue is presented in [16], where ANN forecasting models are again the most widely represented algorithms.
The Global Forecast System is one of the most widely used global-scale NWP models [17][18][19], providing public, freely-licenced, hourly weather forecasts over different grids of horizontal and vertical points covering the entire planet. Four GFS products are currently operated: the 1.00°, 0.50°, and 0.25° horizontal grid resolution models, and the surface flux (sflux) model, with a horizontal resolution of roughly 13 km. The Global Data Assimilation System (GDAS) is another NWP model. It is used to process observational measurements (aircraft, surface, satellite, and radar) which are scattered and irregular in nature, and place them into a gridded, regular space. Those gridded measurements are then used by other models like GFS as a starting point to develop weather forecasts. Four GDAS products are currently active, with each one feeding initialisation data to one of the four aforementioned GFS products. Both the GFS and GDAS models are developed and maintained by the National Oceanic and Atmospheric Administration (NOAA), a governmental agency belonging to the United States Department of Commerce. More details about GFS and GDAS physics and subsystems can be found in [20].
One issue with GFS products is the lack of open, long-term data repositories offering access to their data outputs. Although all GFS and GDAS products are publicly available at the NOAA Operational Model Archive and Distribution System (NOMADS) near-real-time repository of NWP input and output data, each individual file is only stored there for 10 days [21]. For long-term storage, the NOAA maintains the Archive Information Request System (AIRS) [22], but only the coarser resolution GFS products (1.00° and 0.50°) are stored. However, this is not the case for GDAS products; outputs from the 0.25° product have been stored at the AIRS repository since June 2015, while sflux files have been stored since February 2012.
In this study, a real photovoltaic installation located in the South of Italy is modelled using ANN, and its PV power production is predicted. Two data sources are used to feed the neural network models. The first one includes experimental weather and power production data gathered in [23] from the real PV system during 2012 and 2013. The second source uses weather data from the GDAS sflux model, one of the NWP models from the NOAA with the highest spatial resolution, and the one with the highest resolution outputs that are publicly available for the temporal span of the experimental data campaign. Three scenarios are designed with different combinations of monitoring and GDAS weather data to train and test the neural network model. PV power output data from the monitoring campaign are used in all scenarios.
The fundamental idea behind the present study (shared with previous works by the same authors) is to evaluate the performance of global-scale data sources as either complements or replacements of more local- or regional-scale data sources (either forecasts or measurements) for different uses. This approach generally implies lower accuracy in the individual retrieved values. Performance differences between global-scale models and local sources may be larger or smaller depending on the quality of said models, but the former are not expected to be better except for outlier cases (e.g. faulty or non-representative measurements, or ill-tuned local models). However, local data sources have relevant issues related to the lack of standardisation (requiring ad hoc retrieval and processing solutions) and the lack of historic data (e.g., the unavailability or excessive cost of the required amounts of data). If the differences in data accuracy and precision for global-scale models can be kept within an acceptable threshold, they may be outweighed by the benefits of said models regarding the aforementioned issues.
Multiple studies have already combined different NWP models with statistical algorithms in order to forecast PV generation outputs, including ANN algorithms. To the best of the authors' knowledge, however, very few studies have combined weather data from the GDAS products with ANN models, and none has used GDAS products for photovoltaic power forecasting. Hence, the novelty and main objective of this study is to evaluate the performance of the GDAS sflux model as a replacement for in situ weather measurements with the aim of predicting photovoltaic power generation.

Modelled System and Study Conditions
The photovoltaic system analysed in this article is located on the campus of the University of Salento in Monteroni di Lecce, Puglia, in southern Italy. The installation, fully described in [24], is grid-connected and covers a net area of 4710 m² on the roofs of the parking lots of the aforementioned university. It is comprised of 3000 monocrystalline silicon modules connected in series, with a total nominal power of 960 kWp. The azimuth of all modules is −10°, since they are oriented to the south east. However, two groups of modules can be differentiated according to their slope within the PV system: 3° (PV1) and 15° (PV2). Table 1 includes the main specifications of both groups. The location of the present study matches the coordinates of the campus where the PV system is placed, i.e., 40°19′32.16″ N, 18°5′52.44″ E. This study area is characterised by a Mediterranean climate, with warm winters and dry summers. It is located 18 km from the nearest shoreline, with an average elevation of 35 m above sea level. The starting and ending dates of this study are conditioned by the duration of the weather and PV power monitoring campaign carried out in [23] over said installation, as explained in the next subsection. Hence, the chosen temporal span of the study starts on 6 March 2012 at 00:00:00 UTC and ends on 30 December 2013 at 23:59:59 UTC.

Monitored Data
The first data source used in this research is the measurement acquisition work carried out in [23] regarding the same PV system that is modelled in the present study. Said installation is monitored by means of two types of sensors for weather variables: an LP-PYRA02 irradiance sensor to measure the hourly mean solar irradiance on the two inclined surfaces (3° and 15°), and a PT100 temperature sensor to gauge the hourly mean air temperature and the temperature of the PV modules. Hourly mean photovoltaic power data for the entire installation is extracted from three inverters. The air temperature, the two solar irradiations, and the output power compose the monitoring dataset, the first of two used in the present study.

Numerical Weather Model Data
The second data source used in this study comes from one of the NWP models belonging to the NOAA. The GDAS sflux model, which assimilates scattered observational measurements into a regular grid of data used to initialise the GFS sflux forecast model, was chosen. The GDAS sflux, like its GFS counterpart, has been regularly evolving since its creation, improving its resolution and its physics and numerical components. The version used in this study applies a hybrid Eulerian grid (technical identifier T574) with 23 km of horizontal resolution and 64 vertical levels. Like all GFS and GDAS products, it is solved four times each day, with cycles starting at 00, 06, 12 and 18 h UTC. Each cycle generates data outputs at three-hourly resolution up to a 9 h horizon. Only the earliest forecast hours (00 and 03) of each cycle are used, as weather conditions at the farther hours (06 and 09) can be more accurately described by the first hours of the next cycle. The output of this product, starting from 13 February 2012, is available from the AIRS online repository. Thus, the same time span as the monitoring dataset can be covered with GDAS data.
The nearest neighbour point of the GDAS sflux horizontal grid is located at coordinates 40°22′29.21″ N, 18°0′1.56″ E. The horizontal distance between said point and the location of the modelled PV system is roughly 9.9 km. This distance is considered small enough to directly use the data from the GDAS point, as in a nearest neighbour strategy, instead of applying more complex spatial interpolation algorithms.
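The quoted distance can be reproduced with the haversine great-circle formula; a minimal sketch (mean Earth radius of 6371 km assumed):

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))  # mean Earth radius, km

# PV system (40°19'32.16" N, 18°5'52.44" E) vs. nearest GDAS grid point
pv = (40 + 19 / 60 + 32.16 / 3600, 18 + 5 / 60 + 52.44 / 3600)
gdas = (40 + 22 / 60 + 29.21 / 3600, 18 + 0 / 60 + 1.56 / 3600)
print(round(haversine_km(*pv, *gdas), 1))  # ≈ 9.9 km
```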
The output files generated by the different GDAS models contain results for multiple weather variables for all the points of their horizontal grids and vertical levels. For temperature, different variables are available for instantaneous, average, maximum, and minimum values at soil-, distance-, and pressure-based vertical levels. For solar irradiation, two fluxes (short- and long-wave) are available with two propagation directions (downwards and upwards) at the local surface level, along with other levels. The instantaneous air temperature predicted at 2 m over the local surface and the instantaneous downward short-wave radiation flux at local surface level are the chosen variables, as they are most analogous to air temperature and global solar irradiation as measured by a weather station [25]. Those two variables compose the GDAS dataset, the second one used in this study.

Auxiliary Data
Temperature and solar irradiation values at each time step are the key components that decide the power output of photovoltaic panels. However, there are other variables that may modulate the effect of those weather variables. In particular, the position of the Sun affects the amount of solar radiation flux that can be used by the solar panels. Said position follows a daily cycle between dawn and dusk, but also a yearly cycle due to the relative positions of the Earth and the Sun, leading to differences in solar flux incidence angles between winter and summer, and between early morning, noon, and late afternoon.
To include the aforementioned differences in the prediction models, two additional input variables are used: the hour of the day, ranging from 0 to 23, and the hour of the year, ranging from 0 to 8759. These two auxiliary variables are inspired by a previous study by one of the authors [26] where they were both used (in addition to the hour of the week) to characterise the evolution of thermal demands of a large building during different time periods. However, the nature of the system modelled in the present study does not have a direct dependency on the day of the week, so the hour of the week is not included as an input variable. The two selected variables are added to both monitoring and GDAS datasets.
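The two auxiliary variables can be derived directly from a UTC timestamp; a minimal sketch (function name is illustrative, not from the original study):

```python
from datetime import datetime, timezone

def time_features(ts: datetime):
    """Return (hour_of_day, hour_of_year) for a UTC timestamp.

    hour_of_day ranges over 0..23; hour_of_year over 0..8759
    (0..8783 in leap years such as 2012).
    """
    start_of_year = datetime(ts.year, 1, 1, tzinfo=timezone.utc)
    hour_of_year = int((ts - start_of_year).total_seconds() // 3600)
    return ts.hour, hour_of_year

# Example: 12 April 2013, 14:00 UTC
print(time_features(datetime(2013, 4, 12, 14, 0, tzinfo=timezone.utc)))
```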

Data Preprocessing
Real data measured with automatic sensors are often incomplete or contain errors and inconsistencies, so data preprocessing is crucial. Cleaning and organisation techniques prepare the data and make them suitable for use with machine learning models. In particular, modifications applied to the monitoring dataset are focused on fixing minor inconsistencies and removing erroneous or missing data. The dataset contains many empty records. In some of the night-time slots (between 10 p.m. and 3 a.m., both inclusive), no data is available since no measurements are collected. The solar irradiation is known to be zero (by definition) during the night, but that is not the case for air temperature. However, the fact that no temperature data are available at night is not an issue; since there is no photovoltaic power production at night, that period is irrelevant for the model. Adding night-time data would provide redundant information that would increase the complexity of the model and the calculation time needed without producing relevant results.
To avoid empty records that could negatively affect learning models, rows containing null data are removed. The same procedure is used when duplicated values or incomplete records are detected, where at least one of the variables is missing. Some discrepancies are also corrected, i.e., some records have a time shift of a few hours, probably due to an error in the data dumping. To correct this, the times of sunrise and sunset are used as a reference. These times are determined as indicated in [27], considering the dates, latitude, and longitude of the location of the facility, calculating the declination by Spencer's method and the sunset hour angle. Reference dawn, noon, and dusk hours for 2013 at the study location are shown in Figure 1.

The GDAS dataset is initially composed of discrete values for one in every three hours only (i.e., hours in the [00:00, 03:00, ..., 21:00] group, which will hereafter be called 'GDAS-hours'), while the monitoring dataset contains hourly measured values. A fast correction for ensuring compatible time coordinates between datasets would be to filter the monitoring dataset to remove 'non-GDAS-hours' (i.e., any hour not belonging to the GDAS-hours group). However, this coarse time discretisation would fail to capture relevant points for the solar irradiation and PV power variables. As shown in Figure 1, neither noon (maximum irradiation in clear sky conditions) nor most dawn (irradiation start) or dusk (irradiation end) times are close to any of the GDAS-hour instants.
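The declination and sunset-hour-angle computation used above for the sunrise/sunset reference can be sketched as follows. This is a minimal version in solar time, ignoring atmospheric refraction and the equation-of-time correction; the implementation referenced in [27] may differ in detail:

```python
import math

def spencer_declination(day_of_year: int) -> float:
    """Solar declination in radians, from Spencer's Fourier series."""
    g = 2 * math.pi * (day_of_year - 1) / 365
    return (0.006918 - 0.399912 * math.cos(g) + 0.070257 * math.sin(g)
            - 0.006758 * math.cos(2 * g) + 0.000907 * math.sin(2 * g)
            - 0.002697 * math.cos(3 * g) + 0.00148 * math.sin(3 * g))

def sunrise_sunset_solar_time(day_of_year: int, latitude_deg: float):
    """Sunrise and sunset in solar hours, via the sunset hour angle."""
    phi = math.radians(latitude_deg)
    delta = spencer_declination(day_of_year)
    omega_s = math.acos(-math.tan(phi) * math.tan(delta))  # sunset hour angle
    half_day = math.degrees(omega_s) / 15.0  # the Sun moves 15 degrees per hour
    return 12.0 - half_day, 12.0 + half_day

# Near the March equinox at the study latitude (~40.33° N),
# sunrise/sunset should be close to 6 h and 18 h solar time.
print(sunrise_sunset_solar_time(80, 40.33))
```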
To avoid losing information from the monitoring dataset, values from the GDAS dataset must be interpolated over their temporal coordinates to an hourly resolution. Second order B-splines are used to generate continuous piecewise polynomial functions for both the global horizontal irradiation and air temperature variables of the GDAS dataset. These piecewise curves are fitted to the available GDAS-hours values of temperature and irradiation. For the latter, additional fitting points are included for the dawn and dusk instants of each day (with a 0 W/m² irradiation value). After generating the fitted curves, values for the non-GDAS-hours instants are extracted and added into the dataset. As all the original GDAS data correspond to either forecasts of 0 or 3 h, it can be stated that the interpolated GDAS dataset only contains forecasts inside the 0-5 hourly range.
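The hourly refinement described above can be sketched with SciPy's B-spline interpolation. The values below are synthetic stand-ins for one day of GDAS irradiation data, not actual model outputs, and clipping negative overshoot to zero is an added assumption not stated in the paper:

```python
import numpy as np
from scipy.interpolate import make_interp_spline

# Synthetic stand-in for one day of 3-hourly GDAS irradiation values (W/m^2);
# dawn and dusk anchor points are added with zero irradiation, as described.
gdas_hours = np.array([6.0, 9.0, 12.0, 15.0, 18.0])
irradiation = np.array([40.0, 420.0, 780.0, 430.0, 30.0])
dawn, dusk = 5.4, 18.6
x = np.concatenate(([dawn], gdas_hours, [dusk]))
y = np.concatenate(([0.0], irradiation, [0.0]))

spline = make_interp_spline(x, y, k=2)  # second order B-spline
hourly = np.arange(6.0, 19.0)           # hourly grid between dawn and dusk
estimates = np.clip(spline(hourly), 0.0, None)  # guard against negative overshoot
```

Because `make_interp_spline` builds an interpolating (not smoothing) spline, the original GDAS-hour values are preserved exactly, and only the in-between hours are new.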
As mentioned, the monitoring dataset contains two solar irradiation variables, each measured at a different tilt angle (3° and 15°). The GDAS dataset, however, only contains values of irradiation on a horizontal plane (the same convention as most general-purpose automatic weather stations). As larger differences in tilt angle cause larger divergences in irradiation values, care must be taken when comparing irradiation variables from the two datasets. After some preliminary analyses, it was found that the behaviour of the irradiation variables with tilts of 3° (the first irradiation variable from the monitoring dataset) and 0° (the one from the GDAS dataset) was sufficiently similar. Thus, the irradiation measured at 15° is removed from the monitoring dataset, while that measured with a tilt of 3° is compared directly against the global horizontal irradiation from GDAS.
Once both datasets are preprocessed, they are compared to find faulty time instants. If at least one of the variables has a missing value at a given instant, that instant is entirely removed for all variables of both datasets. This reduces the final available number of hours of data, but ensures that all hours fed into the prediction models are complete. After this and previous filters have been applied, 11,132 valid hours of data remain. This value corresponds to 69.6% of the total hours belonging to the temporal span of the study, including all night hours that are not present in the monitoring data included in [23].
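The instant-wise alignment of the two datasets can be sketched with pandas; the frames, variable names, and values below are illustrative, not the actual datasets:

```python
import numpy as np
import pandas as pd

# Hypothetical hourly frames for the two sources (names are illustrative)
monitoring = pd.DataFrame(
    {"temp": [15.2, np.nan, 17.1],
     "irr": [310.0, 450.0, 520.0],
     "power": [210.0, 305.0, 360.0]},
    index=pd.date_range("2013-04-12 10:00", periods=3, freq="h", tz="UTC"),
)
gdas = pd.DataFrame(
    {"temp": [14.8, 16.0, np.nan],
     "irr": [290.0, 430.0, 500.0]},
    index=monitoring.index,
)

# Keep only instants where every variable in both datasets is present
valid = monitoring.dropna().index.intersection(gdas.dropna().index)
monitoring, gdas = monitoring.loc[valid], gdas.loc[valid]
```

Here only the 10:00 instant survives: 11:00 is missing a monitoring value and 12:00 a GDAS value, so both are removed from both frames.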

Estimation of Photovoltaic Power: Artificial Neural Network Model
The ANN models used in this study are multi-layer perceptrons, a class of neural networks composed of multiple layers of interconnected artificial neurons. The first is the input layer, which ingests the different input variables fed into the neural network model. The last is known as the output layer, which yields the predicted values generated by the model. Between these two layers, there is a variable number of hidden layers (zero or more) which connect the input and output layers. The number of artificial neurons in the output layer is determined by the number of output variables. However, the number of neurons for the input and hidden layers is not predetermined, but rather a variable parameter to be optimised.
All the different MLP models built for this study share a common set of configuration parameters. The neurons in both the input and hidden layers use the rectified linear unit (ReLU) function as the activation function [28]. Batch learning is used as the training methodology [29]. The chosen optimisation algorithm is Adaptive Moment Estimation (Adam), a stochastic gradient descent method [30]. The training stop criterion uses a separate fraction of the training data (a randomly chosen 10% of the training hours) to evaluate the current stage of the trained model, using the mean absolute error (MAE) as the evaluation metric. An additional patience-based stop criterion is included, where training stops if the performance of the model does not improve for 100 consecutive epochs.
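This shared configuration can be approximated with scikit-learn's `MLPRegressor`. This is only a rough sketch: scikit-learn's early stopping monitors the validation score rather than the MAE, and the data below are synthetic stand-ins, not the study's datasets:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Synthetic stand-in data: 500 hourly samples with 5 input features
rng = np.random.default_rng(0)
X = rng.uniform(size=(500, 5))
y = X @ rng.uniform(size=5)  # placeholder for PV power output

model = MLPRegressor(
    hidden_layer_sizes=(30,),   # a single hidden layer (count is tunable)
    activation="relu",          # ReLU activation [28]
    solver="adam",              # Adam optimiser [30]
    batch_size=32,              # batch learning (size is an assumption)
    early_stopping=True,        # hold out part of the training data...
    validation_fraction=0.1,    # ...a random 10%, as in the paper
    n_iter_no_change=100,       # patience of 100 epochs
    max_iter=2000,
    random_state=0,
)
model.fit(X, y)
```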
In the next subsections, the monitoring and GDAS datasets are combined into three different scenarios to train and test the different MLP models. At the same time, the available days of data are split into training and testing samples. These samples, applied over the different scenarios, are used to select the best MLP model and to evaluate its performance when fed with monitoring and/or GDAS meteorological data.

Training and Testing Scenarios
Three different scenarios, with combinations of the two data sources, are used in this study. For the first one, the ANN model is both trained and tested using weather input data from the monitoring dataset. For the second scenario, the model uses the same training with measured data as the first one, but the testing is done using weather inputs from the GDAS dataset. For the third scenario, both training and testing are done using GDAS weather input data. PV power outputs from the monitoring dataset are used in the training stages for all three scenarios. A schematic of the different scenarios is presented in Figure 2. The differences between the first and second scenarios lie exclusively in the testing part of the method. Their ANN models are trained using the same input data, following the same chain of operations, and have virtually the same values in their weight matrices. The results from the first scenario provide a way of evaluating the ability of the ANN to model the studied photovoltaic system. The predictions made by the model of the second scenario are affected by both the nature of the GDAS data and the modelling of the photovoltaic system done by the neural network. Hence, a comparison between the predictions of the first and second scenarios enables an independent analysis of the effects of the GDAS data. The third scenario assumes that no locally measured weather data are available, and determines whether the GDAS dataset is adequate to entirely replace these local weather measurements.

Training and Testing Dates
The days comprising the temporal interval of the study are divided into a training sample and three testing samples. The training sample is fed into the ANN algorithms to build the photovoltaic power estimation models, while the testing samples are used to evaluate the performance of the estimation models.
First, all available complete days are identified. Here, a complete day is defined as any day containing at least all of its daytime hours (hours between the dawn and dusk instants, computed as indicated in the Data Preprocessing section) and with no missing data in any of its available hours (in either the monitoring or the GDAS dataset). From this pool of complete days, two days are selected at random from each month and added to the first and second test samples, respectively.
The random selection algorithm ensures that all complete days from each month have the same probability of being included in either test sample. It also ensures that no day belongs to both random test samples. The first and second testing samples each comprise 22 individual days. These samples can be considered representative of the entire temporal span of the study, as all months are guaranteed to be equally represented. This means that all seasonal weather conditions are taken into account when testing the performance of the estimation models.
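The per-month random draw can be sketched as follows (function name, seed, and toy day pool are illustrative):

```python
import random
from collections import defaultdict
from datetime import date, timedelta

def pick_test_days(complete_days, seed=0):
    """Draw two distinct complete days per month: one for each random test sample."""
    by_month = defaultdict(list)
    for d in complete_days:
        by_month[(d.year, d.month)].append(d)
    rng = random.Random(seed)
    sample1, sample2 = [], []
    for month_days in by_month.values():
        # sampling without replacement guarantees the two days differ,
        # so no day can end up in both test samples
        first, second = rng.sample(month_days, 2)
        sample1.append(first)
        sample2.append(second)
    return sample1, sample2

# Toy pool: every day of March and April 2012 assumed complete
pool = [date(2012, 3, 1) + timedelta(days=i) for i in range(61)]
s1, s2 = pick_test_days(pool)
```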
Two complete weeks of data are handpicked from the remaining complete days, not belonging to either of the two random test samples, to form the last testing sample. They span from 28 April to 5 May 2012, and from 12 April to 19 April 2013. This sample is used to graphically represent the behaviour of the weather and power output variables. The two complete weeks are long enough to capture multi-day features of the studied variables, while still allowing hourly-resolution patterns to be spotted. However, days belonging to this testing sample are not to be considered representative of the entire temporal span of the study, as they were manually chosen.
Finally, the training sample includes all days not belonging to any of the three testing samples, and comprises the majority of the available data. This training sample is fed into the different ANN architectures to build the estimating models.

Selection of Best Model
Once the training and testing temporal samples have been selected, the different ANN models are built. The number of hidden layers (varying between zero and two) and the number of neurons inside each of the input and hidden layers are the only configuration parameters that differ between models. The number of neurons of the output layer is fixed by the number of output variables (only one, i.e., PV power production), as already stated. All other configuration parameters are kept constant.
With the required parameters defined for all neural network models, the training dataset of the second scenario is fed into each ANN model, using hours of data from the training sample. Then, the testing dataset from the same scenario is fed into each trained model to generate PV power estimations, using the first testing sample of random days. These estimations are compared against the corresponding monitored values of PV outputs, and mean nRMSE values are obtained for each tested ANN model. The same process is repeated, training and testing each model from scratch using data from the third scenario.
The best neural network model is chosen considering the combination of the mean nRMSE errors from the second and third scenarios. This best ANN model is comprised of an input layer of five neurons and a single hidden layer of 30 neurons, in addition to the single-neuron output layer. This selected architecture is shown in Figure 3. The second and third test samples are finally used to evaluate the performance of the chosen model for all three scenarios, as explained in the Results and Discussion section.

Error Measurements
Three error metrics are used to evaluate the performance of the ANN model when predicting photovoltaic power, and also to compare the two datasets used on each of its input variables: mean bias error (MBE), root mean square error (RMSE), and normalised root mean square error (nRMSE). They are defined by the three following equations:

\[ \mathrm{MBE} = \frac{1}{N}\sum_{i=1}^{N}\left(X_i - Y_i\right) \]

\[ \mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(X_i - Y_i\right)^2} \]

\[ \mathrm{nRMSE} = \frac{\mathrm{RMSE}}{Y_{max}} \]

where X_i is an individual value from one of the variables of the GDAS dataset (estimations), Y_i is the corresponding value from the monitoring dataset (observations), Y_max is the largest value of the entire monitoring dataset (43.53 °C for temperature, 1045.48 W/m² for solar irradiation, and 848.66 kW for PV production), and N is the number of data hours (sample size).
The MBE provides a measurement of the general bias of a given variable, while the RMSE provides more information about individual discrepancies with reality for a large set of estimations. Both of them compute error values in physical units. The nRMSE is a nondimensional version of the RMSE. It is useful for evaluating the relative performance between unrelated magnitudes. In the case of power generation systems, it can be used to directly compare the output errors from modelled systems with different nominal capacities [16].
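The three metrics can be computed directly from paired estimation and observation arrays; a minimal sketch with toy values (the Y_max value is the PV production maximum quoted above):

```python
import numpy as np

def mbe(x, y):
    """Mean bias error: positive when estimations X overshoot observations Y."""
    return float(np.mean(x - y))

def rmse(x, y):
    """Root mean square error, in the physical units of the variable."""
    return float(np.sqrt(np.mean((x - y) ** 2)))

def nrmse(x, y, y_max):
    """RMSE normalised by the largest observed value (nondimensional)."""
    return rmse(x, y) / y_max

# Toy example: three hourly PV power estimations vs. observations (kW)
x = np.array([500.0, 620.0, 410.0])   # estimations (e.g. GDAS-driven model)
y = np.array([520.0, 600.0, 430.0])   # observations (monitoring)
print(mbe(x, y), rmse(x, y), nrmse(x, y, 848.66))
```

Note how the sign cancellation in the MBE (here −20/3 kW, versus an RMSE of 20 kW) mirrors the behaviour discussed in the Results section, where daily over- and underestimations partially compensate over long periods.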

Results and Discussion
In order to properly evaluate the performance of GDAS numerical data when coupled with an ANN model for predicting the power generation of a PV system, both the input weather and the output power variables are analysed. First, input values from the GDAS dataset are compared with those of the monitoring dataset, for both temperature and solar irradiation inputs. Then, PV power predictions obtained from the ANN model under the three training and testing scenarios are compared with the real PV measurements taken from the monitoring dataset.

Figure 4 compares solar irradiation values from the monitoring and GDAS datasets for each of the two manually picked weeks included in the test samples. For each time period, both the real (monitoring) and estimated (GDAS) signals, and the difference between those signals, are represented. Temperature graphs for the manual weeks, and both temperature and irradiation graphs for the random days, are omitted for the sake of brevity.

Table 2 shows the MBE, RMSE, and nRMSE results for temperature and solar irradiation in the GDAS dataset. Errors are computed using the corresponding values from the monitoring dataset as a reference. The results are computed for three different periods: the entire temporal span of the study, the 22 randomly chosen days from the second test sample, and the two manually picked weeks from the third test sample. Individual error values are calculated for each day of each period, and then the mean and standard deviation of those errors are computed. It is clear that GDAS has a moderate tendency to underestimate both mean air temperature and solar irradiation for the studied location and temporal interval. However, MBE values are much closer to zero than their RMSE counterparts for the entire study span and the 22 randomly chosen days, especially for solar irradiation.
For the two manually chosen weeks, which cover similar dates of 2012 and 2013, the two metrics lie at more comparable distances from zero. This behaviour shows that underestimations and overestimations of the weather variables on individual days tend to almost cancel each other out over periods that comprise a larger variety of weather conditions. Mean and standard deviation values of the nondimensional errors for solar irradiation are lower than their temperature equivalents in all temporal samples.

Analysis of Weather Inputs
Key instants in the daily behaviour of the irradiation curves, namely the dawn and dusk hours and the hour of maximum value, match closely between the monitoring and GDAS variables, as can be seen in Figure 4. The same behaviour is found for the randomly picked days; for almost all days, the GDAS estimations and the monitoring data reach their maximum value at the same hour or within a one-hour difference, while dawn and dusk hours match for most days with maximum differences of one hour.
When comparing sample periods, errors for the randomly chosen days are more similar to those of the entire study span than to those of the manually chosen weeks. This indicates that the former sample is more representative of the global behaviour of both weather variables over the study span. Although the sample of two weeks provides an efficient way of visualising specific features of the solar irradiation inputs, numerical error metric values for that sample should be treated as examples and should not be generalised.
The main objective of the present study is the prediction of PV power outputs, not of air temperature or solar irradiation. However, both of these weather variables are key inputs of the ANN model used for predicting photovoltaic power, particularly solar irradiation. As the GDAS dataset contains interpolated weather values for two out of every three hours, it is sensible to compare the error values in this dataset with those of similar studies. For example, [31] uses an ANN model trained with the LMA algorithm to forecast surface irradiation in the South of China from extraterrestrial solar irradiation, among other inputs, feeding the models with both historical data series and statistical feature parameters. Their RMSE results are ~43 W/m² for sunny days, and ~85-255 W/m² for cloudy days (using statistical parameters or historic series as inputs, respectively). The RMSE irradiation errors from Table 2 are larger than those of the statistics-based predictions from [31], but lower than those of the predictions based on historic series, which are more comparable.
It must be highlighted that GDAS is a NWP model, meaning that the derived meteorological data are estimated (forecast) values. For the particular case of GDAS, this estimation includes the interpolation of scattered real observations onto a regular grid of points, as well as very-short-term forecasting of future weather conditions. These spatial interpolations and temporal extrapolations imply a certain level of uncertainty. In addition, GDAS forecasts have a three-hour resolution. Instantaneous air temperature and solar irradiation values every three hours are taken from the GDAS outputs and interpolated to a one-hour temporal resolution using second-order B-splines. This means that short-term phenomena (e.g., transient shadows due to passing clouds) cannot be captured by the GDAS model. Even with these limitations, the resulting weather values (particularly solar irradiation) show a reasonably good match with the monitored data. The nRMSE errors for temperature and irradiation are lower than 17% and 12%, respectively. The RMSE errors for solar irradiation are in accordance with those found in the related literature, and the estimated key instants of the daily behaviour of these variables (dawn, noon, and dusk) closely match those of the monitored data. Figure 5 shows the photovoltaic power outputs for the same two weeks used in Figure 4, again including real (monitoring) and estimated (Scenarios 1-3) signals, and the differences between them. Output values for the three analysed scenarios are included, as well as values from the monitoring dataset for reference. Again, temperature graphs are omitted.
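The interpolation of the 3-hourly GDAS samples to a one-hour resolution with second-order B-splines can be sketched with SciPy's interpolating spline. The irradiation values below are invented for illustration; the study's actual spline implementation and boundary handling are not specified here.

```python
import numpy as np
from scipy.interpolate import make_interp_spline

# GDAS outputs every 3 hours (0, 3, ..., 21 h); values are illustrative only.
hours_3h = np.arange(0, 24, 3)
irr_3h = np.array([0.0, 0.0, 150.0, 520.0, 610.0, 340.0, 40.0, 0.0])  # W/m²

# Second-order (k=2) interpolating B-spline through the 3-hourly samples,
# evaluated on a one-hour grid, as described in the text.
spline = make_interp_spline(hours_3h, irr_3h, k=2)
hours_1h = np.arange(0, 22)  # stay inside the fitted interval
irr_1h = spline(hours_1h)
```

In practice one might also clip the interpolated irradiation at zero, since a quadratic spline can overshoot slightly below zero around dawn and dusk.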

Analysis of PV Power Predictions
A comparison between solar irradiation inputs and PV outputs for four of the randomly chosen days is shown in Figure 6. They cover situations of good and bad matching between monitoring and GDAS solar irradiation for both 2012 (first and second subplots, corresponding to days in March and April 2012) and 2013 (third and fourth subplots, for days in August and October 2013). Table 3 shows the error results of the photovoltaic power predictions for the three scenarios. The same considerations as for Table 2 apply here. In this new table, only the random days and manual weeks are used; the entire temporal span of the study is not considered, in order to avoid including data values already used during the training of the ANN model.
The performance of the predicted PV output variable is reasonably good in all scenarios and for both the random days and the manual weeks samples. As with the input variables, MBE results for power predictions are much closer to zero than their RMSE counterparts, meaning that power biases for individual days and hours tend to be nullified over more complete periods of time. RMSE errors in power production are lower than 25 kW (first scenario) and 85 kW (second and third scenarios). The real magnitude of these errors is easier to evaluate when compared with the 960 kWp of peak production installed in the monitored PV system, and with the actual maximum PV production value in the monitoring dataset, i.e., 849 kW. In fact, nRMSE values are lower than 3% for the first scenario and lower than 10% for the other two, with maximum standard deviation values of 5%.
Graphs of power outputs and irradiation inputs for the 22 randomly chosen test days (as well as those of the two manually chosen weeks) show almost identical patterns. This can be seen in the examples in Figure 6, in which the power prediction curves from the second and third scenarios closely follow the solar irradiation curve predicted by GDAS, while the power prediction curve from the first scenario replicates the solar irradiation curve provided by the monitoring dataset. This behaviour is to be expected, due to the high correlation between solar irradiation and the power output of a PV system [16,32]. Key instants in the daily behaviour of the PV production curves (starting, peak, and ending production hours) for the monitoring data match closely with the predictions of the ANN model in all three scenarios, in the same fashion as with solar irradiance. Power outputs from the first scenario are the most similar to measurements from the monitoring dataset, with very low errors (mean RMSE lower than 25 kW) and almost identical curves for all random days and manual weeks. These similarities are to be expected, as the neural network model uses monitoring data for both training and testing in this first scenario. The performance of the PV predictions from the second and third scenarios is also quite similar, in terms of both error metrics (except mean MBE values for the manual weeks) and graphics. This is a remarkable result, as the ANN models from these scenarios are trained using weather inputs from different datasets, even though they both use the same weather data to test their performance. Although the performance of the second and third scenarios is not as good as that of the first one, it is still quite significant, with mean bias errors smaller than 14 kW below the real outputs, and mean nRMSE errors lower than 10% of the maximum power output measured over the entire study span.
As with solar irradiation, it is sensible to compare the results for PV production with those available in the literature. In [33], next-day forecasts of the power outputs of a 264 kWp PV plant in the North of Italy were generated using an ANN model, trained with the error back-propagation method and coupled with a clear-sky solar radiation model. The error analysis took into account three significant days with sunny, partially cloudy, and cloudy weather conditions, obtaining nRMSE values of 12.5%, 24%, and 36.9%, respectively. In [32], nRMSE values in the 10.91-23.99% range were achieved when predicting power outputs over a forecast horizon of 1-24 h for the same PV system as that modelled in the present study (a 960 kWp installation located in the South of Italy), using an Elman ANN model. In [34], a 1 MWp PV plant in California was modelled using a feed-forward ANN model and a genetic algorithm/ANN (GA/ANN) hybrid model, among others. The reported RMSE errors for forecasts 1 and 2 h ahead were 88.23-142.74 kW for the ANN model, and 72.86-104.28 kW for the GA/ANN model. The reported nRMSE values were computed using a different definition, thus making comparisons with the present study unreliable.
The nRMSE values from Table 3 for the random days in the second and third scenarios are equivalent to those of the best-performing cases in [33] (sunny days) and [32] (one-hour-horizon forecasts). The corresponding RMSE values are slightly lower than those obtained in [34] with their ANN model for a forecast horizon of one hour, and slightly worse than those from their GA/ANN model. It is worth noting that no distinction is made between prediction performance for sunny and cloudy days in the present study. However, the selection of test days ensures that both high-irradiation days (typical of the summer season and clear-sky conditions) and low-irradiation days (typical of the winter season and clouded-sky conditions) are represented in the random test sample. Likewise, no distinction between hourly forecast horizons is made here, but the nature of the GDAS dataset implies that all data belong to a forecast horizon from 0 up to 5 h, as explained in the Data Preprocessing section. The PV system modelled in [34] is not identical to the one in the present study; however, the peak output productions of both installations are similar enough that at least a soft comparison of dimensional metrics like the RMSE can be made.

Comments on Training and Testing Scenarios
After presenting and commenting on the relevant results for photovoltaic power predictions in the three different training and testing scenarios, a brief discussion about the implications of each scenario is in order.
The first proposed scenario both trains and tests the performance of the chosen ANN model using data from the monitoring dataset. No inaccuracies due to the forecast nature of GDAS data are fed into the neural network model here. Hence, all differences between predictions and actual values are due to the mathematical modelling of the PV system performed by the ANN model. As all the tested error metrics perform very well in this scenario, it can be concluded that the proposed neural network model is indeed adequate for modelling the studied installation.
The second proposed scenario uses the same training as the first one, so the resulting trained model is virtually identical and the same conclusion about its modelling performance applies. The testing here is done using GDAS data. As already stated, differences in prediction performance between the first and second scenarios can be attributed to uncertainties in the GDAS data. The error values are indeed higher for this second dataset, but the relative differences are not drastic, and can still match (and even outmatch) the performance of other forecast strategies reported in the literature, as stated in the previous subsection. For a generic monitored PV installation, this scenario corresponds to having an ANN model trained with a historic dataset of in situ weather measurements. In that case, GDAS data could still be used to predict power production for the coming hours, or to fill missing days in historic datasets.
The third and last scenario completely replaces the weather data from the monitoring dataset with those from the GDAS dataset, keeping only the measured PV power outputs from the former. Here, the trained ANN model differs from that of the other two scenarios (although the same basic configuration is used), so the prior conclusion about the performance of the ANN model does not apply. However, the error results for this scenario are very similar to those of the second one, with only slightly worse bias errors and almost identical RMSE and nRMSE errors. This implies that a historic dataset of in situ weather measurements may not be a crucial requirement when aiming to predict future power outputs with GDAS. Further studies targeting different locations and facilities would be required to confirm this. Nevertheless, it seems that even if sensors for monitoring the relevant weather variables were never available at the location of the PV system, GDAS forecasts could still be used to train an ANN model able to predict power production.
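The three scenarios differ only in which weather source feeds the training and testing stages. The experimental design can be sketched as below, using scikit-learn's MLPRegressor on synthetic stand-in data; the network architecture, hyperparameters, and all numerical values are assumptions for illustration, not the configuration used in the study.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n, split = 200, 150

# Synthetic stand-ins for the two weather sources and the measured power
# (hypothetical values; the real datasets are not reproduced here).
X_monitoring = rng.uniform([0.0, 10.0], [1000.0, 35.0], size=(n, 2))  # [irradiation, temperature]
X_gdas = X_monitoring + rng.normal(0.0, [50.0, 1.0], size=(n, 2))     # noisy NWP-like estimate
y_power = 0.8 * X_monitoring[:, 0] + rng.normal(0.0, 10.0, n)         # PV output driven by irradiation

scenarios = {
    1: (X_monitoring, X_monitoring),  # train and test with in situ data
    2: (X_monitoring, X_gdas),        # train with in situ data, test with GDAS data
    3: (X_gdas, X_gdas),              # train and test with GDAS data
}

results = {}
for k, (X_train_src, X_test_src) in scenarios.items():
    model = make_pipeline(
        StandardScaler(),
        MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000, random_state=0),
    )
    model.fit(X_train_src[:split], y_power[:split])
    rmse = np.sqrt(np.mean((model.predict(X_test_src[split:]) - y_power[split:]) ** 2))
    results[k] = rmse
```

The design choice this illustrates is that scenario 2 probes only the test-time sensitivity to NWP uncertainty, while scenario 3 also exposes the training stage to it, which is the case relevant to installations without local weather sensors.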

Conclusions
The main objective of this study was to evaluate the applicability of the GDAS sflux numerical weather model as a replacement for in situ weather measurements to model the power outputs of a photovoltaic system. Three training and testing scenarios, with different combinations of monitoring and GDAS weather data, were used to feed and evaluate the performance of one prediction model using a multilayer perceptron ANN algorithm. Solar irradiation and air temperature were the main input variables, while PV power production was the only predicted output.
Bias errors on individual days tended to be compensated when considering more complete temporal samples. This happened for both the weather inputs and the PV power outputs.
Mean nRMSE values of 2.9% and 9.9% on the PV outputs were achieved for the most representative testing sample in the first and second scenarios, respectively. A comparison between those values led to the conclusion that most of the power prediction error was due to the approximate nature of the GDAS solar irradiation data. However, the 100.00 W/m² mean RMSE error achieved for this weather variable was in accordance with other solar irradiation forecast methodologies included in the bibliography. The neural network model used was shown to model the real power system with solid accuracy.
An analysis of the second scenario indicated that the GDAS sflux product is a reliable source of weather data for forecasting future PV power outputs, when an ANN model built with past in situ weather measurements is already available. The analysis of the third scenario, on the other hand, showed that even when said historic dataset of local weather measurements is not available, GDAS data can be effectively used to train the ANN model, with a minimal loss in the accuracy of PV power predictions.
Mean nRMSE errors of less than 10% in the PV power outputs were achieved for both the second and third scenarios. A comparison with other relevant studies showed that the errors in the photovoltaic power predictions for all scenarios in this study were analogous to those presented in the solar forecasting literature. The use of GDAS weather data in combination with ANN algorithms makes it possible to predict PV power with a performance that matches or even outmatches other PV forecast methods.
Future research expanding this work could focus on tackling some aspects which were not fully analysed in the present study. The influence of other weather variables could be studied (like wind effects on the cooling of the solar modules, or rainfall removing possible depositions of fine dust and dirt), if reference measured data were available for said variables. Second order B-splines were used here to interpolate weather values for the hours when GDAS data were not generated, but other temporal interpolation methods could be tested. Finally, the performance of the GDAS model could be tested against other NWP models, like a high resolution version of the Global Forecast System.
In conclusion, the present study shows that the GDAS sflux numerical weather model is a reliable source of weather data for photovoltaic power prediction when combined with artificial neural network algorithms. Estimative data from this numerical model can be fed into an existing ANN model, already trained with local weather measurements, or can entirely replace said measurements and be used to train the model when historical local weather data are not available.