I-Solar, a Real-Time Photovoltaic Simulation Model for Accurate Estimation of Generated Power

Abstract: Global energy consumption and costs have increased exponentially in recent years, accelerating the search for viable, profitable, and sustainable alternatives. Renewable energy is currently one of the most suitable alternatives. The high variability of meteorological conditions (irradiance, ambient temperature, and wind speed) requires the development of complex and accurate management models for the optimal performance of photovoltaic systems. The simplification of photovoltaic models can be useful in the sizing of photovoltaic systems, but not for their management in real time. To solve this problem, we developed the I-Solar model, which considers all the elements that comprise the photovoltaic system, the meteorological conditions, and the energy demand. We have validated it on a solar pumping system, but it can be applied to any other system. The I-Solar model was compared with a simplified model and a machine learning model calibrated in a high-power and complex photovoltaic pumping system located in Albacete, Spain. The results show that the I-Solar model estimates the generated power with a relative error of 7.5%, while the relative error of machine learning models was 5.8%. However, models based on machine learning are specific to the system evaluated, while the I-Solar model can be applied to any system.


Introduction
One of the biggest threats facing the world population is climate change, which is primarily caused by the emission of greenhouse gases (GHG) produced by the use of fossil fuels in industrial processes. The benefits of renewable energy (REn) are clearly visible, being a prerequisite to reach socioeconomically sustainable systems, and particularly to address the challenges of climate change and the depletion of fossil fuels. These problems require active policies aiming at a rapid transition [1,2]. Thus, REn is presented as a viable and profitable alternative to the use of conventional sources of electricity [3].
Among the different sources of renewable energy, photovoltaic solar energy is in a period of high growth globally [4]. The most important factor for the establishment of this type of system is the cost [5,6]. However, the price of all components included in a photovoltaic installation has drastically decreased in recent years [7], with a drop of up to 85% in the cost of photovoltaic modules [8]. The improvement of photovoltaic modules and the search for highly efficient new materials [9] or module types [10] has led to an expansion, with high levels of investment in photovoltaic solar energy as an alternative to conventional energy sources. Some studies analyzed the competition for land between photovoltaic energy producers and farmers, which can be balanced in what is called agrivoltaics [11]. Although agrivoltaics can open up an additional revenue stream, farmers remain highly concerned about its long-term effects on their land.
The energy consumption in irrigated agriculture has exponentially increased in recent years due to the modernisation of irrigation systems, from surface irrigation to pressurised irrigation [12,13]. It is necessary to find more efficient alternatives [14], mainly in areas with high energy demand, such as irrigable areas with underground water resources, where the extraction cost can reach up to 70% of the total energy cost of irrigation [15]. One of the possible alternatives to decrease the water extraction and application costs is the integration of photovoltaic energy [3,16–18]. Photovoltaic pumping systems have already been established in many countries including the USA [19], India [20], Turkey [21], Spain [12,22] and Algeria [23]. However, photovoltaic pumping systems must be managed properly to obtain quality irrigation, even in complex irrigation systems [24].
Great advances have been made in the development of methodologies to design photovoltaic pumping systems and other uses, such as the Photovoltaic Geographical Information System (PVGIS, http://re.jrc.ec.europa.eu/pvgis/). However, the optimal management of a system, once sized and installed, is a key factor in the satisfactory performance of the system, which requires the development of real-time simulation models. To efficiently manage photovoltaic pumping systems after installation, the following actions need to be taken: (1) generate simulation models of irradiance (W·m⁻²) on inclined surfaces, in order to accurately estimate the real-time direct, diffuse and reflected components of the irradiance; (2) generate simulation models of photovoltaic generators that represent accurately and in real time the generated power (current and voltage in direct current (DC) and alternating current (AC)); and (3) characterize the operation of the equipment required to feed the system, such as the variable frequency drive (VFD) and cables. A successful experience was implemented in Palestine by generating a microgrid solar photovoltaic system for rural development and sustainable agriculture [25]. However, an improvement of the described results would be expected if a more accurate solar model were utilized, as proposed in this manuscript.
In addition, in the process of sizing photovoltaic pumping systems, the photovoltaic generator is frequently oversized, by up to 30% [26], to absorb the high uncertainty in the irradiance data, among other factors. This is a practical measure to ensure the adequate performance of the system, but requires proper regulation, which is usually not well defined. In addition, it increases the complexity of the simulation owing to the necessity of including a regulation algorithm for these cases with excess power generation due to oversizing. Although important developments in photovoltaic simulation models have been realized, no analysis of the effects of the oversizing of the photovoltaic generator on the final operation of the system has been found. Thus, methodologies are needed to generate robust models that allow accurate simulation of the photovoltaic power generated in real time, which is essential for adequate irrigation management.
There are many models to estimate the direct and diffuse components of global irradiance [27,28] that allow these components to be obtained from daily global irradiance values. Erbs et al. [29] developed a model to estimate the diffuse radiation fraction for hourly, daily, and monthly average global radiation. Muneer [30] and Duffie and Beckman [31] established some simplifications, known internationally, considering an isotropic distribution of diffuse irradiance. Shen et al. [32] developed, based on previous references and some modifications, a simulation model of the solar radiation using the Simulink module of MATLAB® (Mathworks Inc., Massachusetts, USA). In addition, there are various free software applications such as PVGIS that offer statistical data provided by satellites to estimate the solar radiation. The model used in this work for the calculation of diffuse irradiance [33] is one of the most used models to estimate the diffuse irradiance on an inclined surface [34], offering accurate results [35]. Thus, it is necessary to develop simulation models that work with irradiance values (W·m⁻²), which ensure adequate management in real time once the system has been sized.
There are many models to estimate the generated power in photovoltaic modules. These include models that: (1) are based on the calculation of the fill factor [36]; (2) determine the short-circuit current and open-circuit voltage variation at a certain operating temperature [37]; (3) determine the current and voltage at the maximum power point (MPP) [38]; (4) predict the monthly energy provided by photovoltaic systems from daily profiles of irradiance and temperature [39]; and (5) increase the output efficiency of a photovoltaic system by finding the MPP [40], similar to the procedure used to estimate the photovoltaic power [22,41,42]. However, simplified models [43] can lead to overestimations in the power available at the output of the photovoltaic generator.
The simpler simulation models tend to ignore or simplify important parameters in the simulation of photovoltaic generators such as resistive parameters and ideality factors, among others. These parameters are not provided by the manufacturer. However, a method to determine the ideality factor of a real solar cell through the Lambert W function has been described [44]. In addition, a model for obtaining the parallel resistance of a solar cell has been developed [45,46]. Even so, no model has been found that integrates all these contributions to improving the simulation of the photovoltaic systems, which is one of the main contributions of this work.
Machine learning techniques have been applied to perform regression analysis of highly non-linear problems between the input and output datasets [47]. As an alternative to the parametric models, such as those described above, there is an increasing trend for the use of non-parametric algorithms, mainly based on machine learning techniques, both for simulation [48] and for prediction [49] of photovoltaic production. The capacity for low-cost monitoring of photovoltaic production systems allows us to obtain a massive quantity of information that is not always used effectively, affording machine learning an opportunity for the treatment and extraction of useful information. Thus, machine learning models can be created to estimate the generated power in real time from simple measured variables, such as irradiance on the horizontal surface, and some meteorological parameters. The main disadvantage of this methodology is that the generated model is specific to the system on which it has been calibrated and validated, so it is necessary to calibrate and validate a model for each case.
The objective of this work was to develop a photovoltaic simulation model, called I-Solar, which allows us to obtain accurate generated powers in real time even in oversized systems. The model considers the integration and calculation of irradiance on an inclined surface from irradiance on a horizontal surface and an accurate simulation of the photovoltaic generator by the integration of different models already developed and validated. This manuscript highlights the importance of the proper simulation of the VFD, the efficiency of which is usually overestimated, as well as integrating the simulation of the rest of the system elements. Also, the possibility of simulating different control algorithms supports decision making on the adequate control of the PV system and its components (primarily the VFD). No references have been found that address all these issues in the same model. I-Solar is a model that can be applied to any PV system at any scale, allowing a more accurate estimation of the PV generation.
This parametric model has been compared with one of the simplified models used as well as nonparametric models generated based on machine learning (AI-Solar model). The methodology and proposed models were calibrated and validated in a high-power and complex photovoltaic pumping system for irrigation in Albacete, Spain.

The Case Study
To analyze, calibrate, and validate the models developed in the present work, they were applied to an irrigated farm named "Peruelos". It is located in the southeast of Albacete, Spain (latitude 38.994°, longitude 1.859°). The irrigated area was approximately 90 ha, and contained almond trees growing at a 7 m × 7 m spacing. The irrigation system in the plot was subsurface drip irrigation energized by a photovoltaic system. The irrigation system had 20 sectors with a highly irregular shape and topography, with elevation differences of up to 60 m.
Energy was provided by a photovoltaic generator, which was composed of 152 polycrystalline silicon photovoltaic modules with 60 solar cells in each module (Table 1). The photovoltaic module model was SM6610P 265 (Astronergy/Chint Solar, Frankfurt, Germany). The layout comprised eight lines in parallel with 19 photovoltaic modules per line. The total installed power was 40 kWp, with a unit capacity per photovoltaic module of 265 Wp. It had a VFD with a nominal power of 30 kW. The photovoltaic modules were oriented to the south with a slope of 8.5°. The variable frequency drive (VFD) installed was 3G3RX-A4220-E1F (Omron Europe B.V., Hoofddorp, Netherlands), with an output nominal current of 57 A and an overvoltage protection of 800 V. The VFD efficiency, according to the manufacturer, was 89.7% at 25% load and 95% at 100% load.

Equipment and Systems for Data Acquisition: Monitoring
To simulate the power generated by the photovoltaic generator, the irradiance values on the horizontal surface were measured with a Middleton EP07/134 calibrated pyranometer (Middleton Solar, Melbourne, Australia), while temperature (°C), wind speed (m·s⁻¹), atmospheric pressure (hPa), and precipitation (mm) were measured with an agro-climatic station SICO WS-600 (SICO Control Systems, Madrid, Spain). These instruments were located next to the photovoltaic generator (Figure 1a). The generated DC power was measured using an electrical network analyser PEL 103 (Chauvin Arnoux, Paris, France), while the generated AC power was measured using an AR5 electrical network analyser (CIRCUTOR, Barcelona, Spain) (Figure 1b). Both analysers had an accuracy of better than 1.5%. With this information, the efficiency of the VFD can be obtained for any generated power [50]. The equipment used for system monitoring was programmed to record the measurements every 10 min during 2016, 2017, and 2018.

Simulation Models of Photovoltaic Power Generation
The I-Solar model was developed in MATLAB® (Mathworks Inc., Natick, MA, USA). It integrates the developments of different authors and proposes new features. The I-Solar model allows simulation of the power generation of photovoltaic solar installations in real time, which is useful not only in photovoltaic pumping systems but also for any application of this type of energy. The results of the I-Solar model were compared with the results obtained from a commonly used simplified model. In addition, we developed a methodology for the accurate characterization of photovoltaic solar energy generation systems based on machine learning, called AI-Solar. The parametric models developed (Simplified and I-Solar) require the calculation of the irradiance on an inclined surface, while the AI-Solar model directly uses the measured irradiance on a horizontal surface. The three approaches were compared.

Calculation of Irradiance on Inclined Surface
The components of irradiance on an inclined surface were obtained from the components of global irradiance on a horizontal surface (GHI) [34], measured with a pyranometer (W·m⁻²). The Direct Insolation Simulation Code (DISC) estimation model [51], improved by [52], allows the calculation of the direct normal irradiance on the horizontal surface (DNI). The DNI was corrected by applying the cosine of the solar zenith angle. Subsequently, the diffuse irradiance on the horizontal surface (DHI) was obtained using Equation (1).
where GHI is the global irradiance on the horizontal surface, DNI is the direct normal irradiance on the horizontal surface, DHI is the diffuse irradiance on the horizontal surface, and θ_z is the solar zenith angle. The direct normal irradiance on an inclined surface (DNI_T) is obtained with a geometric approach that depends on the inclination and orientation angles of the PV generator and on the solar coordinates (Equation (2)).
where ξ is the incidence angle of the sun rays on the inclined surface and σ_z is the solar zenith angle. The diffuse irradiance on an inclined surface (DHI_T) was obtained through the model proposed by [33], while the reflected irradiance on the inclined surface (RI_T) was obtained through Equation (3).
where β is the angle of the inclined surface and ρ is the albedo. The total irradiance on an inclined surface (GI_T) corresponds to the sum of the direct normal irradiance (DNI_T), diffuse irradiance (DHI_T), and reflected irradiance (RI_T) values, as in Equation (4).
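The decomposition and transposition steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the I-Solar implementation: DNI is taken as the beam irradiance normal to the sun rays, the sky-diffuse term uses a simple isotropic model as a stand-in for the model of [33], and the function names and the default albedo of 0.2 are hypothetical.

```python
import math

def diffuse_horizontal(ghi, dni, zenith_deg):
    """Closure relation: DHI = GHI - DNI * cos(zenith), floored at zero."""
    return max(0.0, ghi - dni * math.cos(math.radians(zenith_deg)))

def beam_tilted(dni, incidence_deg):
    """Beam irradiance on the tilted plane (zero when the sun is behind it)."""
    return max(0.0, dni * math.cos(math.radians(incidence_deg)))

def sky_diffuse_tilted_isotropic(dhi, tilt_deg):
    """Isotropic sky-diffuse term (assumption: stands in for the model of [33])."""
    return dhi * (1.0 + math.cos(math.radians(tilt_deg))) / 2.0

def reflected_tilted(ghi, albedo, tilt_deg):
    """Ground-reflected irradiance seen by a plane tilted by tilt_deg."""
    return ghi * albedo * (1.0 - math.cos(math.radians(tilt_deg))) / 2.0

def total_tilted(ghi, dni, zenith_deg, incidence_deg, tilt_deg, albedo=0.2):
    """Sum of beam, sky-diffuse and reflected components on the tilted plane."""
    dhi = diffuse_horizontal(ghi, dni, zenith_deg)
    return (beam_tilted(dni, incidence_deg)
            + sky_diffuse_tilted_isotropic(dhi, tilt_deg)
            + reflected_tilted(ghi, albedo, tilt_deg))

# Example with made-up inputs for a plane tilted 8.5 degrees
print(round(total_tilted(800.0, 700.0, 30.0, 25.0, 8.5), 1))
```

A useful sanity check of the sketch is that a horizontal plane (tilt of 0°, incidence angle equal to the zenith angle) returns the measured GHI.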

Description of the Simplified Model to Determine the Generated Power
Before describing the developed model, we detail the most commonly used simplified model [43]. In this model, the short-circuit current (I_SC) is obtained using Equation (5).
where G is the irradiance (W·m⁻²), and I_SC,STC is the short-circuit current in standard test conditions (STC) (Table 1). The cell temperature (T_C, in °C) was estimated using Equation (6).
where T_a is the ambient temperature (°C), NOCT is the nominal operating cell temperature (°C), and G is the irradiance (W·m⁻²). The open-circuit voltage (V_OC, in V) is calculated using Equation (7). [53,54], and the normalised cell voltage (v_oc, in V), are calculated using Equations (8) and (9). The ideal cell fill factor (FF_O) without considering the series resistance [36] is calculated using Equation (10). The normalised resistance (r_s) is calculated using Equation (11), considering the fill factor in standard conditions (FF_STC). The voltage (V_MAX) and current (I_MAX) at the point of maximum power are obtained through Equations (12)-(15), where a and b are coefficients to determine V_MAX and I_MAX.
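As an illustration of this chain of calculations, the following Python sketch uses the widely published textbook form of the fill-factor method. It is assumed, not confirmed, that this form corresponds to Equations (5)-(15). The temperature coefficient of −2.3 mV·°C⁻¹ per cell, the NOCT of 45 °C, and the module values in the example (38 V, 9 A, 265 W) are illustrative placeholders rather than the Table 1 data.

```python
import math

K_OVER_Q = 8.617333e-5  # Boltzmann constant over elementary charge (V/K)

def ideal_fill_factor(v_oc_norm):
    """Empirical ideal fill factor as a function of the normalised voltage."""
    return (v_oc_norm - math.log(v_oc_norm + 0.72)) / (v_oc_norm + 1.0)

def simplified_mpp(g, t_amb, i_sc_stc, v_oc_stc_cell, ff_stc,
                   noct=45.0, dvoc_dt=-0.0023):
    """Return (V_MAX per cell, I_MAX, cell temperature) for given conditions."""
    i_sc = i_sc_stc * g / 1000.0                      # linear in irradiance
    t_cell = t_amb + (noct - 20.0) / 800.0 * g        # NOCT-based estimate
    v_oc = v_oc_stc_cell + dvoc_dt * (t_cell - 25.0)  # assumed coefficient
    # Normalised series resistance, fixed from the STC fill factor
    v_oc_norm_stc = v_oc_stc_cell / (K_OVER_Q * 298.15)
    r_s = 1.0 - ff_stc / ideal_fill_factor(v_oc_norm_stc)
    # Normalised voltage at the operating cell temperature
    v_oc_norm = v_oc / (K_OVER_Q * (t_cell + 273.15))
    a = v_oc_norm + 1.0 - 2.0 * v_oc_norm * r_s
    b = a / (1.0 + a)
    i_max = i_sc * (1.0 - a ** (-b))
    v_max = v_oc * (1.0 - (b / v_oc_norm) * math.log(a)
                    - r_s * (1.0 - a ** (-b)))
    return v_max, i_max, t_cell

# Hypothetical 60-cell module: V_OC = 38 V, I_SC = 9 A, P_STC = 265 W
cells = 60
ff_stc = 265.0 / (38.0 * 9.0)
v_max, i_max, t_cell = simplified_mpp(800.0, 25.0, 9.0, 38.0 / cells, ff_stc)
print(round(cells * v_max * i_max, 1), "W per module at", t_cell, "degC")
```

The generator power then follows by multiplying the cell voltage by the number of cells and modules in series and the current by the number of strings in parallel.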
The maximum power achieved in the photovoltaic generator (Pow_MAX_G) is obtained using Equation (16).
where N_ms is the number of modules in series, N_cs is the number of cells in series, N_mp is the number of modules in parallel, and N_cp is the number of cells in parallel. In the simplified model, the VFD efficiency (η_VFDSimplified) is calculated with a polynomial that expresses the efficiency as a function of the input power (Equation (17)).
where POW_AC is the VFD output power in alternating current, POW_DC is the VFD input power in direct current, pow_AC = POW_AC/POW_VFD, and POW_VFD is the VFD nominal power. The parameters k_0, k_1, and k_2 are characteristic loss coefficients of the VFD that correspond to mean values obtained by [55] from a representative sample of inverters existing in the market. To calculate losses in the cables, the same approach as described in the I-Solar model methodology is used.
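A loss-coefficient model of this kind is commonly written as η = pow_AC / (pow_AC + k_0 + k_1·pow_AC + k_2·pow_AC²). The sketch below assumes that form for Equation (17) and inverts it to obtain the AC power from a given DC power. The coefficient values in the example are placeholders, not the mean values reported in [55].

```python
import math

def vfd_efficiency(pow_ac_norm, k0, k1, k2):
    """Efficiency for a normalised output power pow_ac_norm = POW_AC / POW_VFD."""
    if pow_ac_norm <= 0.0:
        return 0.0
    losses = k0 + k1 * pow_ac_norm + k2 * pow_ac_norm ** 2
    return pow_ac_norm / (pow_ac_norm + losses)

def ac_power_from_dc(pow_dc, pow_vfd, k0, k1, k2):
    """Solve p_dc = p_ac + k0 + k1*p_ac + k2*p_ac**2 for the AC output power."""
    p_dc = pow_dc / pow_vfd
    if p_dc <= k0:                      # input does not cover the no-load losses
        return 0.0
    if k2 == 0.0:
        p_ac = (p_dc - k0) / (1.0 + k1)
    else:
        disc = (1.0 + k1) ** 2 + 4.0 * k2 * (p_dc - k0)
        p_ac = (-(1.0 + k1) + math.sqrt(disc)) / (2.0 * k2)
    return p_ac * pow_vfd

# Placeholder coefficients and a 30 kW drive fed with 20 kW of DC power
k0, k1, k2 = 0.02, 0.025, 0.08
pow_ac = ac_power_from_dc(20.0, 30.0, k0, k1, k2)
print(round(pow_ac, 2), "kW AC, efficiency",
      round(100.0 * vfd_efficiency(pow_ac / 30.0, k0, k1, k2), 1), "%")
```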

Simulation Model Proposed, I-Solar
The main novelties of this proposed model are:
• Implementation of a more accurate electrical model for the performance of PV cells and, therefore, of the modules.
• Implementation of a control algorithm that considers PV oversizing effects on the working point of the PV generator, rather than only the performance when working at the maximum power point.
• Determination of the cell temperature using not only the ambient temperature, but also the wind speed, considering the cooling effect by convection.
• Determination of power losses produced in cables for all calculation stages (Table 2).
• Determination of the yearly ageing of photovoltaic modules through a linear method based on values provided by the manufacturer.
• Characterization of the VFD efficiency curve through measured real values rather than the values provided by the manufacturer.
The parameters that characterize the photovoltaic modules to accurately simulate the generated power are not always provided by the manufacturer [45], but are determined by testing laboratories. In this case study, these parameters were provided by the prestigious US laboratories Sandia National Laboratories and the National Renewable Energy Laboratory (NREL) for the photovoltaic module considered (Table 3). The performance of the photovoltaic generator under variable environmental conditions [56] is described by Equation (18), because the STC can rarely be found in real-life situations [57,58].
where I_L is the light generated current, I_O is the dark saturation current, R_S is the series resistance, R_sh is the parallel resistance and "a" is the ideality factor. The cell temperature was estimated in I-Solar using the method developed in [59], as described in Equation (19).
where T_c is the cell temperature (°C), T_m is the photovoltaic module temperature (°C), E is the incident solar irradiance on the photovoltaic module surface (W·m⁻²), E_o is the reference solar irradiance in the photovoltaic module (1000 W·m⁻²), and ∆T is the temperature difference between the cell and the back surface of the photovoltaic module. The general estimation of power losses in DC cables is based on the voltage drop approach, as in Equation (20).
where VD is the voltage drop (V), L is the length of cable (m), I is the current in the cable (A), σ is the conductivity of the material (m·Ω⁻¹·mm⁻²), s is the cross-section area of the cable (mm²), and cosϕ = 1 in DC. However, for AC cables, which are long in this case, the power loss estimation is based on the cable resistance approach, with the resistance obtained according to the cable temperature reached (Equation (21)).
where CL_POW is the power losses in the cable (kW), I_max is the AC current in the cable (A), N is the number of conductors, L is the length of the cable (m), and R is the resistance according to the temperature reached (Ω).
Usually, photovoltaic generators for solar pumping are oversized to guarantee the irrigation time, overcome highly variable irradiance, and compensate for the ageing of the modules, among other reasons. However, this oversizing has effects on their operation, which must be considered when simulating the system. The current photovoltaic models obtain results while working at the nominal point of maximum power (MPP_N) of the Current-Voltage curve. However, oversizing of the photovoltaic generators causes the generator to work out of the MPP_N, depending on the characteristics of the electrical system that limits the generator (Figure 2a,b). The actual working point (AWP) is controlled by the VFD. In the I-Solar model, a control algorithm of the power generated that considers generator oversizing can be used in the simulation process. With the I-Solar model, these aspects have been considered, suggesting a clear improvement over current existing models. A new approach is provided to improve the model, which consists of obtaining a control algorithm for the generated power to determine the power that is provided by the photovoltaic generator in real operating conditions. To do this, the algorithm distinguishes between two zones with different operating behaviours. The first is where the photovoltaic power obtained is higher than that of the AWP, while the second is located between the AWP and the nominal power of VFD (POW_VFD).
The control algorithm for the generated power continually checks the power generated by the photovoltaic generator to determine the zone where the value is located, and calculates the photovoltaic power in STC using the real irradiance at any given moment. Subsequently, Equation (22) establishes a relationship between the STC power achieved and the power corresponding to the study zone through use of the dimensionless coefficient "C_A", which allows readjustment of the intensity according to the corresponding voltage value of the I-V curve defining the real maximum power achieved at each moment.
where POW_STC_i is the achieved power in STC according to the real irradiance at instant "i", and POW_S_A is the maximum power achieved according to the study zone.
In the case where the power obtained during the photovoltaic simulation is not located in the zones affected by oversizing, the algorithm does not provide any change in the initial power calculated. Therefore, the model I-Solar is prepared to detect both conditions automatically, with and without oversizing.
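To make the idea of a working point displaced from the MPP more concrete, the sketch below locates, on a single-diode I-V curve, the voltage at which the module power equals an externally imposed limit. It is not the C_A-based algorithm of Equation (22), whose exact form is not reproduced here. It is a plain bisection on the I-V curve, under the assumption that the drive pushes the generator towards open circuit when less power is needed. The five module parameters and the function names are hypothetical.

```python
import math

def module_current(v, i_l, i_o, r_s, r_sh, a):
    """Solve the implicit single-diode equation for I at voltage v (bisection)."""
    def residual(i):
        v_d = v + i * r_s
        return i_l - i_o * (math.exp(v_d / a) - 1.0) - v_d / r_sh - i
    lo, hi = -1.0, i_l  # residual is positive at lo and negative at hi here
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def open_circuit_voltage(i_l, i_o, r_sh, a):
    """Voltage at which the module current is zero (bisection)."""
    lo, hi = 0.0, a * math.log(i_l / i_o + 1.0)
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if i_l - i_o * (math.exp(mid / a) - 1.0) - mid / r_sh > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def working_point(p_limit, i_l, i_o, r_s, r_sh, a, steps=2000):
    """Return (voltage, power): the MPP, or the curtailed point if P_MPP > limit."""
    def power(v):
        return v * module_current(v, i_l, i_o, r_s, r_sh, a)
    v_oc = open_circuit_voltage(i_l, i_o, r_sh, a)
    v_mpp = max((v_oc * k / steps for k in range(steps + 1)), key=power)
    p_mpp = power(v_mpp)
    if p_mpp <= p_limit:
        return v_mpp, p_mpp            # no oversizing effect at this instant
    lo, hi = v_mpp, v_oc               # power falls from P_MPP to zero here
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if power(mid) > p_limit:
            lo = mid
        else:
            hi = mid
    v = 0.5 * (lo + hi)
    return v, power(v)

# Hypothetical module parameters; limit = 30 kW drive shared by 152 modules
params = dict(i_l=9.0, i_o=5e-10, r_s=0.35, r_sh=300.0, a=1.6)
print(working_point(30000.0 / 152, **params))
```

When the available MPP power is below the limit, the function returns the MPP unchanged, which mirrors the behaviour described above for instants not affected by oversizing.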
The manufacturers of photovoltaic modules guarantee a useful life of 25–30 years. However, each year the photovoltaic modules degrade with age, affecting their global efficiency. In this model, a polynomial function has been implemented (Equation (23)), based on the percentage efficiency loss given by the manufacturer between the first and final years of the guaranteed useful life, which allows the annual power losses in the photovoltaic modules to be calculated. Thus, in the I-Solar model it is only necessary to select the year of useful life of the photovoltaic modules (x) to obtain results.
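A first-degree polynomial through the first-year and final-year warranty values is one simple way to express such an ageing factor, consistent with the linear method listed among the novelties. The sketch below assumes that form. The loss values and warranty length in the example are hypothetical, and the coefficients of Equation (23) are not reproduced here.

```python
def ageing_factor(year, first_year_loss, final_year_loss, warranty_years):
    """Fraction of nominal power available in a given year of useful life.

    Losses are fractions (0.025 = 2.5 %), interpolated linearly between the
    first and the final year of the warranty (assumed form of the ageing law).
    """
    if not 1 <= year <= warranty_years:
        raise ValueError("year must lie within the warranty period")
    slope = (final_year_loss - first_year_loss) / (warranty_years - 1)
    return 1.0 - (first_year_loss + slope * (year - 1))

# Hypothetical warranty: 2.5 % loss in year 1, 20 % loss in year 25
for x in (1, 10, 25):
    print(x, round(ageing_factor(x, 0.025, 0.20, 25), 4))
```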
The efficiency of the VFD was determined under highly variable environmental conditions (Equation (24)). With the values obtained, an adjustment with a second-degree polynomial was carried out. This parameter, which can only be obtained by measuring the power in DC and AC (before and after the VFD), is a key factor for the accurate simulation of solar pumping systems.
where η_VFD is the efficiency of the VFD (%), POW_AC is the output power of the VFD in AC (kW), and POW_DC is the input power of the VFD in DC (kW).
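The measured efficiency and its second-degree adjustment can be sketched as follows, assuming that Equation (24) is the ratio of AC to DC power expressed as a percentage. The fit is an ordinary least-squares quadratic solved through the normal equations in pure Python. The data in the example are synthetic and were generated from a known quadratic, so they are not measurements from the case study.

```python
def vfd_efficiency_pct(pow_ac, pow_dc):
    """Measured VFD efficiency in percent (assumed form of Equation (24))."""
    return 100.0 * pow_ac / pow_dc

def fit_quadratic(x, y):
    """Least-squares fit y ~ c0 + c1*x + c2*x**2; returns [c0, c1, c2]."""
    s = [sum(xi ** k for xi in x) for k in range(5)]
    b = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]
    # Augmented 3x3 normal-equation matrix
    m = [[s[i + j] for j in range(3)] + [b[i]] for i in range(3)]
    # Gaussian elimination with partial pivoting
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, 3):
            f = m[row][col] / m[col][col]
            for k in range(col, 4):
                m[row][k] -= f * m[col][k]
    # Back substitution
    c = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        c[i] = (m[i][3] - sum(m[i][j] * c[j] for j in range(i + 1, 3))) / m[i][i]
    return c

# Synthetic DC powers (kW) and efficiencies (%) from a known quadratic
pow_dc = [5.0, 10.0, 15.0, 20.0, 25.0, 30.0]
eta = [80.0 + 1.2 * p - 0.025 * p ** 2 for p in pow_dc]
print([round(ci, 4) for ci in fit_quadratic(pow_dc, eta)])
```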

Simulation Model Based on Machine Learning, AI-Solar
Different types of machine learning were evaluated using the large volume of data captured with the monitoring system described above. The Classification Learner application of MATLAB® (Mathworks Inc., Natick, MA, USA) as well as the NETLAB library [60] for artificial neural networks in MATLAB® were used with the aim of determining the most appropriate algorithm. The input variables of the model were global irradiance on the horizontal surface (W·m⁻²), ambient temperature (°C), and wind speed (m·s⁻¹). The output variable was the power generated in AC (W). Table 4 shows the machine learning evaluation. Data were split into calibration (85%) and testing (15%) sets. In the calibration process, a cross-validation was performed using 5 folds. To select the best predictive model, the model performance was evaluated using the testing data.
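The data partitioning described above can be illustrated with a short index-based sketch. It only shows the bookkeeping of an 85/15 split followed by 5-fold cross-validation on the calibration set. The shuffling, the seed, and the function names are assumptions, and the study itself used MATLAB tools rather than this code.

```python
import random

def split_indices(n, test_fraction=0.15, seed=0):
    """Shuffle record indices and return (calibration, testing) index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_fraction)
    return idx[n_test:], idx[:n_test]

def k_fold(indices, k=5):
    """Yield (training, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        training = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield training, folds[i]

# Example with 1000 hypothetical 10-min records
calibration, testing = split_indices(1000)
for training, validation in k_fold(calibration):
    print(len(training), len(validation))
```

Each candidate algorithm would be fitted on the training part of every fold and scored on the validation part, with the held-out testing set used only for the final comparison.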

Statistical Analysis of the Evaluated Models
To analyze the goodness of fit of the models, a statistical analysis was performed based on the calculation of the root mean square error (RMSE), relative error (RE), and coefficient of determination (R²). Additionally, the normality of the residuals and their homoscedasticity were also evaluated.
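These indicators can be computed as in the sketch below. The definitions are assumptions, since they are not spelled out in the text: RE is taken as the RMSE divided by the mean of the observed values, in percent, and R² as one minus the ratio of the residual to the total sum of squares.

```python
import math

def rmse(observed, simulated):
    """Root mean square error between observed and simulated values."""
    n = len(observed)
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(observed, simulated)) / n)

def relative_error(observed, simulated):
    """RMSE relative to the mean observed value, in percent (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    return 100.0 * rmse(observed, simulated) / mean_obs

def r_squared(observed, simulated):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Small made-up example
obs = [1.0, 2.0, 3.0, 4.0]
sim = [1.1, 1.9, 3.2, 3.8]
print(round(rmse(obs, sim), 4), round(relative_error(obs, sim), 2),
      round(r_squared(obs, sim), 3))
```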

Results and Discussion
The results first show the statistical adjustment of the different models (simplified, I-Solar, and AI-Solar) to the measured data. Subsequently, the main variables that affect the performance of the models (cell temperature, control algorithm, and VFD efficiency) are analyzed.

Analysis of the Regression Models Based on Artificial Intelligence, AI-Solar Model
To determine the type of machine learning that best represents the operation of the photovoltaic system, the main statistical results were obtained for each of the machine learning types described in the methodology (Table 5). The most accurate machine learning type was a Gaussian process regression (GPR) with exponential kernel function, with a relative error of 5.8%, an RMSE of 1226.7 W, and an R² of 0.95. The poor performance of the linear regression models shows the complexity and non-linearity of the analyzed problem. The same result is observed when using a linear kernel in the SVM algorithms. However, the remaining algorithms that use non-linear kernels show adequate performance.
The implementation of machine learning algorithms requires the acquisition of massive data to calibrate and validate the model. Although the performance of the model is highly accurate, the generated model is specific to the system analyzed and cannot be applied to other photovoltaic systems.

Operation Analysis of the Developed Models
A comparison between the power values measured with the electrical network analyser at the VFD output (in AC) and the evaluated models is shown in Figure 3: (1) simplified model, (2) I-Solar model, and (3) AI-Solar model. The statistical analysis is shown in Table 6. The developed AI-Solar model was trained with GPR-Exponential (GPR-E), which results in greater precision. Of the models evaluated, the model based on machine learning, AI-Solar, shows the best adjustment. The I-Solar and AI-Solar models offer a clear improvement over the simplified model, decreasing the relative error from 14.2% for the simplified model to 7.5% and 5.8% for the proposed I-Solar and AI-Solar (artificial intelligence) models, respectively.
In addition, the results of the simplified model show a high spread of errors due to the overestimation both of the output power of the photovoltaic generator in DC and of the VFD, which was calculated based on empirical coefficients that led to the fixing of the maximum possible power of the VFD at 30 kW.
To determine the potential sources of error in the simplified model with respect to the parametric model I-Solar, the main variables that affect the simulation power have been analyzed, such as cell temperature, control of the generated power, and actual performance of the VFD.

Wind Speed Effect in Cell Temperature
The cell temperature estimated by the I-Solar model is lower (Figure 4) because it considers wind speed and the resulting temperature decrease due to convection processes. The largest differences are obtained for medium-high cell temperature values, with a maximum difference of up to 11 °C. Overestimating the cell temperature can lead to an underestimation of the generated power. However, Section 3.2 concluded that the simplified model tends to overestimate the generated power, which confirms the lack of precision of the simplified model in global terms. If the simplified model had estimated the cell temperature more accurately, its inaccuracies would have been even more apparent.
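The effect of wind speed on the estimated cell temperature can be illustrated with two widely used formulations. Both are assumptions for illustration and may differ from the equations in the methodology section: a NOCT-type expression without a wind term, and a Sandia-type expression with an exponential wind term. The coefficients `a`, `b`, and `delta_t` are values commonly quoted for an open-rack glass/cell/polymer-sheet module, and `noct` is a typical datasheet value; none of them is taken from the installation studied here.

```python
import math

def cell_temp_noct(g_poa, t_amb, noct=45.0):
    """NOCT-type estimate without wind: T_c = T_a + (NOCT - 20) / 800 * G."""
    return t_amb + (noct - 20.0) / 800.0 * g_poa

def cell_temp_wind(g_poa, t_amb, wind_speed,
                   a=-3.56, b=-0.075, delta_t=3.0, g_ref=1000.0):
    """Sandia-type estimate with a wind-dependent convection term.

    g_poa: irradiance on the module plane (W/m^2)
    t_amb: ambient temperature (degrees C)
    wind_speed: wind speed (m/s)
    """
    # Back-of-module temperature falls as wind speed rises because b < 0.
    t_module = g_poa * math.exp(a + b * wind_speed) + t_amb
    # Cell temperature sits slightly above the back-of-module temperature.
    return t_module + g_poa / g_ref * delta_t

# Illustrative conditions: 1000 W/m^2, 30 degrees C ambient, 0 and 5 m/s wind.
print(cell_temp_noct(1000.0, 30.0))
print(cell_temp_wind(1000.0, 30.0, 0.0))
print(cell_temp_wind(1000.0, 30.0, 5.0))
```

For the same irradiance and ambient temperature, the wind-aware estimate decreases as wind speed increases, whereas the NOCT-type estimate does not change.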

Effect of the Generated Power Control Algorithm
In common practical situations where the photovoltaic generator is oversized in the design phase, the generated power control algorithm results in an accurate estimation of the generated power for high irradiance values. In this case study, as an example for a representative period in summer (June, July, and August), this difference can reach up to 6.5 kW. Figure 5 presents the statistical adjustment, and Table 7 shows the statistical analysis, of the relationships between measured power and simulated power with the I-Solar model just downstream of the photovoltaic generator (in DC). The results indicate that, despite the slight differences in RMSE and R² values compared to Table 6 (in AC), which may be due to the different precision of the measurement devices, there is a significant improvement in the relative error (RE).
As can be seen in Figure 5, high power values show a non-linear behaviour that reflects the zonal differentiation of the generated power control algorithm mentioned in the methodology section, which is also reflected in the measured values. Omitting the algorithm in oversizing scenarios leads to unfeasible results for any photovoltaic installation.

Influence of the Variable Frequency Drive (VFD) Efficiency
One of the key issues in accurately simulating a photovoltaic system to energize a pump using a VFD is estimating the efficiency of the VFD for any irradiance condition. In this case study, the adjustment curve of the VFD efficiency for the I-Solar model ($\eta_{\text{VFD I-Solar}}$) is shown in Figure 6 and Equation (25), using a second-degree polynomial equation depending on the output power in the VFD ($x$).
$$\eta_{\text{VFD I-Solar}} = -0.0042\,x^{2} + 1.0876\,x + 4.4977 \qquad (25)$$
This methodology has been compared with parameters supplied by the VFD manufacturer. Figure 6 shows that the measured efficiency of the VFD at 30 kW reached a value of 90%, while the value supplied by the manufacturer was close to 95%. All measured efficiency values were much lower than the efficiency specified by the manufacturer, with higher differences for low powers (and consequently low frequencies). The large difference at low power values is significant. For example, at 7.5 kW, the manufacturer indicated a VFD efficiency of 89%, while the measured values only achieved a VFD efficiency of 57%. The high temperatures reached at the VFD locations are an important factor. This equipment mainly works in the middle of the day in summer, when the temperature inside the VFD electrical housing can become very high, even with adequate ventilation. The differences found indicate that it is necessary to characterize the actual efficiency of the VFD under real working conditions. In addition, they highlight the necessity of establishing an improved standard to determine the efficiency of these devices in non-ideal working conditions, primarily at high temperatures.
The comparison between the VFD efficiency supplied by the manufacturer and the VFD efficiency calculated for the simplified method, which was simulated using the general equation described in Section 2.3.2, is shown in Figure 6. The simulated values always lie below the manufacturer's curve and follow a trend opposite to that of the measured values: they approach the manufacturer's curve at low powers (around 89% at 15.65 kW), then decrease at medium power values and continue decreasing at high power values, with a final efficiency of 86.5% at an output power of 30 kW. Thus, considering simplified solutions to take into account the VFD efficiency could lead to large inaccuracies in the final model.
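The practical weight of these efficiency differences can be shown with a short calculation based on the values quoted above. The sketch below estimates by how much the AC output would be overestimated if the manufacturer's efficiency were used in place of the measured one. It assumes, as a simplification, that both efficiencies are applied to the same DC input power; the function name is hypothetical.

```python
def ac_power_overestimate(eta_manufacturer, eta_measured):
    """Relative overestimation (%) of the AC output when the manufacturer's
    efficiency replaces the measured efficiency for the same DC input."""
    return (eta_manufacturer / eta_measured - 1.0) * 100.0

# Efficiencies (%) quoted in the text for 7.5 kW and 30 kW output power.
low_power = ac_power_overestimate(89.0, 57.0)
rated_power = ac_power_overestimate(95.0, 90.0)
print(f"7.5 kW: {low_power:.1f}% overestimation")   # 56.1%
print(f"30 kW: {rated_power:.1f}% overestimation")  # 5.6%
```

Under this simplification the error is roughly ten times larger at low power than at rated power, which is consistent with the need to characterize the VFD under real working conditions.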

Figure 6. Comparison between the VFD efficiency supplied by the manufacturer, the measured VFD efficiency, and the VFD efficiency obtained through the general equation of the simplified method.

Conclusions
This work examines fundamental issues to consider in the modelling of photovoltaic solar energy that are useful in the process of decision making for end users of this technology.
The complexity of developing photovoltaic system tools and models stems from the strong influence of the parameters involved in the process. Collecting large amounts of real-world measured values has allowed us to describe the performance of the photovoltaic installation and to generate an accurate and robust I-Solar model that efficiently integrates all influential stages in photovoltaic production, applying methodologies from different studies as well as new developments from this study. The model has been validated with measured values to manage the system in real time.
Although this work focuses on solar pumping systems, the model can be integrated into any type of photovoltaic system, such as solar pumping systems or grid-connected systems, which gives it high versatility and utility.
The parametric I-Solar model developed to determine the electric power generated in photovoltaic solar systems has allowed us to obtain a similar precision (RE = 7.5%) to the non-parametric model (RE = 5.8%) based on machine learning. The simplicity of the parametric simplified model results in a clear lack of precision (RE = 14.2%). In addition, parametric models are of general application for any photovoltaic solar installation, while the non-parametric models are specific to each installation and require a large number of values derived from the monitoring systems.
Solar radiation values are the basis for calculating photovoltaic production. Integrating them correctly into the calculation of the irradiance components is essential to obtain the irradiance on an inclined surface in real time.
The effect of wind speed on the cell temperature has been demonstrated. Wind decreases the cell temperature through convection and consequently increases the module efficiency. Thus, unlike the simplified model, the I-Solar model includes this variable.
The oversizing of the photovoltaic generator contributes to an increase in the generated power. However, it has been demonstrated that the incorporation of the generated power control algorithm in installations with these characteristics is essential to determine with precision the power used in real time in order to manage the system appropriately.
The need to characterize the VFD efficiency in real time has been demonstrated because of the significant differences between values supplied by the manufacturer and the measured values, which affect the generated power and consequently the system management.
The current capacity for monitoring photovoltaic systems allows us to obtain large amounts of performance data from these systems that can be used in machine learning to establish accurate relationships with the final generation of electric energy. Most machine learning studies focus on predicting photovoltaic energy generation, while the objective of this work is to determine the power generated in real time. In addition, these models are useful for generating alarms when the measured power differs significantly from the power simulated by the system.
The level of precision obtained with the I-Solar model can be useful in decision-making to determine the optimal technical and economic specifications of photovoltaic solar installations.
Funding: This research was funded by the Spanish Ministry of Education and Science (MEC), grant number AGL2017-82927-C3-2-R (co-funded by FEDER) and the University of Castilla-La Mancha predoctoral grant action.
We would like to thank Juan José Toboso, the owner of the Peruelos farm, for his unlimited help in carrying out this work. We would also like to thank the U.S. Sandia National Laboratories and the U.S. National Renewable Energy Laboratory (NREL) for their support and information, which made it possible to develop these models.