Artificial Neural Networks for the Prediction of the Reference Evapotranspiration of the Peloponnese Peninsula, Greece

Abstract: The aim of the study was to investigate the utility of artificial neural networks (ANNs) for the estimation of reference evapotranspiration (ETo) on the Peloponnese Peninsula in Greece for two representative months of wintertime and summertime during 2016–2019 and to test if using fewer inputs could lead to satisfactory predictions. Datasets from sixty-two meteorological stations were employed. The available inputs were mean temperature (Tmean), sunshine (N), solar radiation (Rs), net radiation (Rn), vapour pressure deficit (es-ea), wind speed (u₂) and altitude (Z). Nineteen Multi-layer Perceptron (MLP) and Radial Basis Function (RBF) models were tested and compared against the corresponding FAO-56 Penman Monteith (FAO PM) estimates of a previous study, via statistical indices. The MLP1 7-2 model with all the variables as inputs outperformed the rest of the models (RMSE = 0.290 mm d⁻¹, R² = 98%). The results indicate that even ANNs with simple architecture can be very good predictive models of ETo for the Peloponnese, based on the literature standards. The MLP1 model determined Tmean, followed by u₂, as the two most influential factors for ETo. Moreover, when one input was used (Tmean, Rn), RBFs slightly outperformed MLPs (RMSE < 0.385 mm d⁻¹, R² ≥ 96%), which means that even a sole-input ANN resulted in satisfactory predictions of ETo.


Introduction
Reference Evapotranspiration (ETo) is a key climate parameter investigated in the context of the climate crisis and water resources management [1–4]. Moreover, ETo affects the productive sector of agriculture [5]. In particular, precision irrigation techniques and decision-making irrigation systems demand accurate ETo values [6]. Therefore, effective methods for estimating and predicting ETo are crucial [7,8].
The necessity of acquiring ETo values led to the development of several methods of estimation, ranging from simple empirical or physically based models [9,10] to complex neuro-fuzzy and machine learning algorithms. The methods incorporate data from meteorological stations or, due to the scarcity of the former, remotely sensed data [48–58]. The FAO-56 Penman Monteith (FAO PM) equation requires numerous meteorological variables for effective application [59,60]. Even the more refined empirical methods of ETo, such as Valiantzas' equations [61], share a common denominator: the more meteorological inputs, the higher the accuracy of ETo [62]. On the other hand, there are empirical methods requiring limited climatic variables (e.g., the Hargreaves-Samani equation). However, the accuracy of these methods, compared to FAO PM, is limited, and their performance has in some cases been reported to be season-dependent [63]. The unavailability of input data is a global issue, due to the high cost of equipping and running meteorological stations, especially for developing countries. Thus, reducing the number of inputs of the predictive models to only basic parameters is highly recommended [64]. Those parameters can be air temperature (minimum, maximum and mean values) plus parameters that can [...] literature standards, and to indicate the most influential factors on ETo. This study examines nineteen MLP and RBF (ANN) models utilizing sixty-two meteorological stations in the Peloponnese peninsula in Southwestern Greece (Table A1). The Peloponnese peninsula is a study area with distinct differences over short distances. Empirical methods applied across the area in a previous study [63] showed inferior performance in terms of accuracy, especially for August (summertime). A further objective was to examine whether fewer input variables would produce satisfactory predictions of ETo in terms of accuracy.
Therefore, the number of inputs of the models was successively reduced to examine whether fewer than the seven available inputs, or even a sole input, could predict ETo acceptably. The novelty of the study is twofold. First, it is the first time that ANNs are used to predict ETo across a large area of Southern Greece, which includes changes in elevation, relief and land use/land cover (LU/LC) that are challenging in terms of the consistency of model performance. Second, and even more challenging, continuous climate datasets over a recent four-year period are used, instead of datasets for individual days or short intervals. In addition, the exploitation of a dense meteorological station network, instead of a couple of stations, enhances the significance of the results. A good performance from ANNs, despite the aforementioned difficulties, would demonstrate the flexibility and the potential of ANNs in ETo modeling. This would be useful in cases where there is a shortage of climate data. Moreover, it could save time and decrease computational load for interdisciplinary research purposes.

The Study Area
The Peloponnese is a peninsula in Southwestern Greece that occupies about one sixth of the Greek territory (21,439 km²), with a population of 1,086,935 (census 2011; https://www.statistics.gr/el/statistics/-/publication/SAM03/2011 (accessed on 10 March 2022)). A large part is covered by high hills and mountains, running NW to SE, with an elevation up to 2407 m. Lithology, tectonic activity and climate conditions have shaped the relief of the study area. A well-developed hydrographic network, though with few large rivers, has formed [78]. Based on the latest Copernicus LU/LC classification, the widest urban area is located at the northernmost edge [79]. In addition to urban areas, the main LU/LC types are forest and transitional vegetation, as well as crop plots covering the plains (Figure 1). The broadest plain lies over the western coastal part. According to Köppen-Geiger's classification, the climate of the Peloponnese is Mediterranean warm temperate with dry summers and mild winters (Csa) [80]. The annual normal values of air temperature, precipitation and sunshine range over 8–20 °C, from 400 to over 2000 mm and over 1900–3100 h, respectively (http://climatlas.hnms.gr/sdi/?lang=EN (accessed on 27 April 2022)).

Methods
Meteorological datasets of daily measurements from sixty-two stations under the National Observatory of Athens, for the months August and December of 2016–2019, were utilized (Table A1). These months and years were selected for the application of ANNs in methodological consistency with our previous study [63], where ETo was computed for the Peloponnese by FAO PM, our reference method. In that study, August and December had been selected as typical months of summer and winter, respectively. FAO PM has been widely used as a reference method to estimate ETo in studies investigating ANNs, either as the only reference method [10,70,81] or combined with direct methods [82].
Multi-Layer Perceptron (MLP) and Radial Basis Function (RBF) ANN models were examined in this study. Nineteen combinations of input variables were tested (from seven inputs to one input) with both MLP and RBF models. The inputs included climate and non-climate (such as altitude) variables that affect ETo [83], namely, solar radiation (Rs), net radiation (Rn), sunshine hours (N), mean air temperature (Tmean), vapour pressure deficit (es-ea), wind speed at 2 m above the surface (u₂) and altitude (Z). The first three had previously been calculated as functions of the station latitude and the Julian day, as in Zanetti et al. (2008) [36]. Following Rahimikhoob (2010) [84], the input combinations were subsequently reduced, aiming to explore the possibility of producing acceptably accurate predictions of ETo with fewer variables than the available seven, or even with a sole input variable (such as Tmean or Rn). About three fifths of the sample data were used for training and the other two fifths were used for testing and validation (about one fifth each) [40]. The tested architectures were selected by trial and error, and the ANNs were trained using the Levenberg-Marquardt algorithm [40,41]. The hyperbolic tangent (Equation (1); [85]) was used as the activation function, based on the literature [71,86].
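For reference, the hyperbolic tangent in its standard form, together with the logistic sigmoid it is compared with, can be written as follows. The symbol σ for the sigmoid and the layout are assumptions made here for illustration and are not necessarily the notation of Equation (1).

```latex
\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}, \qquad
\sigma(x) = \frac{1}{1 + e^{-x}}, \qquad
\tanh(x) = 2\,\sigma(2x) - 1
```

The last identity shows that the hyperbolic tangent is a rescaled and shifted sigmoid, which is why its output range is twice as wide and centred on zero.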
The hyperbolic tangent, along with the sigmoid function, is a non-linear function widely used as an activation function in ANNs [85]. Its advantage is that negative inputs are mapped to negative outputs rather than to values near zero: its output spans (−1, 1), twice the range of the sigmoid, whose output spans only (0, 1) [87].
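The partition of the samples into roughly three fifths for training and one fifth each for testing and validation can be sketched as below. This is a minimal illustration and not the procedure actually used in the study: the random shuffling, the fixed seed, the function name `split_indices` and the sample count of 1000 are all assumptions.

```python
import random


def split_indices(n_samples, train_frac=0.6, test_frac=0.2, seed=0):
    """Shuffle sample indices and cut them into training, testing and holdout parts."""
    indices = list(range(n_samples))
    # Assumption: samples are assigned to the three partitions at random.
    random.Random(seed).shuffle(indices)
    n_train = round(train_frac * n_samples)
    n_test = round(test_frac * n_samples)
    train = indices[:n_train]
    test = indices[n_train:n_train + n_test]
    holdout = indices[n_train + n_test:]  # the remaining ~1/5, used for validation
    return train, test, holdout


# Hypothetical number of daily station records
train, test, holdout = split_indices(1000)
print(len(train), len(test), len(holdout))
```

Keeping the holdout part untouched during training is what allows the validation-phase errors to be read as an independent check.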
The values predicted by the ANN models were then compared against the FAO PM values via statistical indices that were computed during the training, testing and validation phases, namely the Sum of Squares Error (SSE) and the relative error (RE) (Table 1). Furthermore, the error levels between predicted and reference values were computed via the Root Mean Square Error (RMSE, mm d⁻¹), Normalized Root Mean Square Error (NRMSE, %), Mean Absolute Error (MAE, mm d⁻¹), Mean Bias (MB, mm d⁻¹) and Sum of Squares Error (SSE, mm² d⁻²) (Table 1). The error values are on the scale of the computed ETo values, except for NRMSE, which is expressed as a percentage. Mean Bias is a signed measure, so in addition to yielding the Mean Bias Error (MBE = |MB|) it provides extra information: a negative sign indicates that the reference ETo value (by FAO PM) is greater than the model-predicted value, and a positive sign the opposite. Moreover, three measures that express correlation, strength of fit or agreement were used, namely, the Pearson correlation coefficient (Pearson's r), the coefficient of determination (R²) and the Index of Agreement (IoA) (Table 1).
Table 1. Formulae of the indices utilized to evaluate the performance of the examined ANN models.
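The indices named above can be computed along the following lines. This is a sketch under stated assumptions rather than a transcription of Table 1: the common textbook definitions are used, NRMSE is assumed to be normalised by the mean of the reference values, R² is taken as the square of Pearson's r, IoA follows Willmott's form, and RE is assumed to be the SSE relative to the sum of squares about the reference mean. The function name `evaluate` and the sample values are hypothetical.

```python
import math


def evaluate(pred, ref):
    """Return error and agreement indices between predicted and reference values."""
    n = len(ref)
    mean_ref = sum(ref) / n
    mean_pred = sum(pred) / n
    diffs = [p - o for p, o in zip(pred, ref)]
    sse = sum(d * d for d in diffs)
    sst = sum((o - mean_ref) ** 2 for o in ref)
    rmse = math.sqrt(sse / n)
    cov = sum((p - mean_pred) * (o - mean_ref) for p, o in zip(pred, ref))
    var_pred = sum((p - mean_pred) ** 2 for p in pred)
    r = cov / math.sqrt(var_pred * sst)
    ioa_den = sum((abs(p - mean_ref) + abs(o - mean_ref)) ** 2
                  for p, o in zip(pred, ref))
    return {
        "SSE": sse,
        "RMSE": rmse,
        "NRMSE_%": 100 * rmse / mean_ref,  # assumption: normalised by the reference mean
        "MAE": sum(abs(d) for d in diffs) / n,
        "MB": sum(diffs) / n,              # negative when the reference exceeds the prediction
        "r": r,
        "R2": r * r,                       # assumption: squared Pearson's r
        "IoA": 1 - sse / ioa_den,          # Willmott's index of agreement
        "RE": sse / sst,                   # assumption: SSE relative to a mean-only model
    }


# Made-up ETo values in mm per day, for illustration only
ref = [1.0, 2.0, 3.0, 4.0, 5.0]
pred = [1.1, 1.9, 3.2, 3.9, 5.1]
for name, value in evaluate(pred, ref).items():
    print(f"{name}: {value:.3f}")
```

With MB defined as prediction minus reference, the sign convention matches the one described in the text, and MBE is simply its absolute value.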

Performance Evaluating Indices
In Table 1, pᵢ stands for the ith value predicted by an ANN, oᵢ stands for the ith observed value, which in this study is the ith reference value estimated by FAO PM,

Results
In total, nineteen models with different input combinations were tested; the results are presented in Table 2. MLP1 7-2, with all the available parameters as inputs, appears to provide the best performance, expressed via the minimum error values in the testing and validation data and the maximum correlation and agreement indices (R², Pearson's r and IoA) between predicted and reference values. The architecture is simple, consisting of one hidden layer with two neurons (Figure 2). The parameter estimates of the MLP1 model are displayed in Table 3. As presented, the values for both the nodes and the hidden layers are non-negligible, which is one of the desirable characteristics of an ANN model. As presented in Table 2, MLP9 6-4-3 closely follows the best-performing model (MLP1), except for the testing SSE value, which is almost double the corresponding MLP1 value. Interestingly, the MLP10 4-3-2 model, with only four inputs and two hidden layers, performs very close to MLP1, with a couple of values (MBE and SSE) even better than those of MLP1.
The added value of MLP10 is that it produces almost the same values with only four basic parameters as inputs, specifically, Rs, es-ea, u₂ and Tmean. The data for these parameters are usually available either from meteorological stations or via remote sensing. Moreover, in the event of any missing data, the parameters can be easily computed [59]. The RBF networks clearly perform worse than the MLPs on our data. RBF3 6-9 is the best among the RBF networks and fourth in the overall ranking of model performance. The relative errors of the MLPs in the validation phase were between 2.2% and 4.2% for all trials, whereas for the RBFs they lay between 2.8% and 10%. For the majority of the models, the holdout RE values were greater than the testing RE values, despite the satisfactory error levels (2.8–4.5%). This indicates that those models were overtrained towards the testing data.
The cases where only a sole basic parameter (Tmean or Rn) was set as input are noteworthy. In those cases, the RBF models performed better than the MLP ones. RBF9 (Rn as input) obtained RMSE = 0.383 mm d⁻¹ and R² = 96.5% with RE = 3.3%. With Tmean as the sole input, the RMSE was around 0.360 mm d⁻¹ and the R² was above 96.7%. However, as previously noted, there was evidence of overtraining of some models (Table 2). Therefore, those models are not recommended.
The influence of each climatic variable on ETo, as determined by the ANN that best fits the data, is also of interest [14]. As shown in Figure 3, the most influential factor is Tmean, followed by u₂. The third factor in the rank is the vapour pressure deficit (es-ea). These results are aligned with the ETo estimates by FAO PM for the same period for the Peloponnese [63].
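To make the MLP1 7-2 architecture concrete (seven inputs, one hidden layer of two neurons with hyperbolic tangent activation, one output), a forward pass can be sketched as follows. The weights and biases are placeholders and are not the parameter estimates of Table 3; the linear output unit, the standardised inputs, the input order and the function name `mlp_7_2_1` are assumptions made for illustration.

```python
import math


def mlp_7_2_1(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: 7 inputs -> 2 tanh hidden neurons -> 1 linear output."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(weights, x)) + bias)
        for weights, bias in zip(w_hidden, b_hidden)
    ]
    # Assumption: the output neuron is linear (identity activation).
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Placeholder parameters, NOT the estimates reported in Table 3
w_hidden = [[0.1] * 7, [-0.2] * 7]
b_hidden = [0.0, 0.1]
w_out = [1.5, -0.5]
b_out = 0.3

# Placeholder standardised inputs, assumed order: Tmean, N, Rs, Rn, es-ea, u2, Z
x = [0.5, 0.2, 0.1, 0.3, -0.4, 0.0, 1.0]
print(mlp_7_2_1(x, w_hidden, b_hidden, w_out, b_out))
```

Such a network has only 2 × 7 + 2 hidden-layer parameters and 2 + 1 output parameters, which illustrates how small the best-performing model is.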

Discussion
The determination of ETo is crucial for climate crisis research, hydrological cycle studies, water resources management and precision irrigation techniques. The available methods of estimation require knowledge of numerous climate parameters, which means an increased cost. Since ETo involves complex and nonlinear relationships, ANNs have proven to be a suitable modeling choice. In this study, ETo has been estimated via nineteen MLP and RBF ANN architectures, with different combinations and numbers of inputs (from seven to one).
The examined period (2016–2019) is interesting in the frame of the climate crisis, since it is recent and includes the warmest (2016) and second-warmest (2019) years since the preindustrial era, which challenges the performance of ANNs.

For the majority of the trials, the RMSE was below 0.4 mm d⁻¹ and the relative error of the testing phase was below 4%. MLPs generally performed better than RBFs for multiple inputs, whereas RBFs performed slightly better when only one input (Rn or Tmean) was set. The model that best fitted the reference values was the one with the most input parameters and only one hidden layer (MLP1 7-2), with RMSE = 0.290 mm d⁻¹, R² = 98% and RE of the testing and validation phases equal to 2.7% and 2.2%, respectively. The different runs with the same input combination but different model architecture showed that increasing the number of hidden layers and the number of neurons in the hidden layer yielded negligible improvement in prediction accuracy. These findings are in line with those of Tabari et al. (2012) [71]. The data for the seven inputs can be easily derived, either from meteorological stations or via remote sensing, or, where missing, can be computed following the FAO guidelines [59]. However, during the trials, the number of inputs was gradually reduced in order to examine whether ANNs can provide satisfactory estimates of ETo when incorporating only the most commonly available climate data. Interestingly, MLP10, with four climate parameters as inputs, produced results very close to those of the best model (MLP1). Moreover, models with two basic parameters as inputs exhibited RMSE up to 0.352 mm d⁻¹, testing RE below 3% and R² at least equal to 97% (MLP6, MLP13). When only one input was used (i.e., Tmean or Rn) in RBF models, the RMSE was below 0.385 mm d⁻¹, testing RE was below 3.6% and R² was at least equal to 96%. According to the literature, those values are considered very good to excellent.
For example, Rahimikhoob (2010) recommended an ANN for the coastal area of the Caspian Sea in Northern Iran, which used only air temperature as an input, with an RMSE = 0.41 mm d⁻¹ and R² = 95% [84], whereas in this study MLP14 with Tmean as an input has better accuracy (RMSE = 0.360 mm d⁻¹ and R² = 96.9%). In the same vein, Zanetti et al. (2008) used an MLP ANN with only temperature and radiation inputs to predict ETo for Campos dos Goytacazes, Brazil [66], while Ravindran et al. (2021) deduced that Rs was the most influential parameter for ETo and used it as a sole input in ANNs for California (R² up to 95.4%) [14]. This indicates that ANNs with simple architecture can be good predictive models of ETo over the Peloponnese for the examined period. In addition, based on the best ANN model (MLP1), we found that Tmean and u₂ were the two most influential factors on ETo, out of the seven examined. This is in line with the findings of a previous study of the same period for the Peloponnese, where ETo had been computed by FAO PM [63]. Tmean, as a proxy of the energy state of the system, is one of the most influential factors on ETo. Its influence reflects the differences in altitude and land cover over short distances across the Peloponnese. Probably because the Peloponnese spans a narrow range of latitudes, the radiation factors were less influential in the determination of ETo. Regarding the second most influential factor (u₂), in contrast with December (winter), increased values of u₂ are not frequent in Greece during August (summer). Therefore, where increased u₂ values occurred in August, they affected the determination of the local ETo values. In conclusion, ANNs resulted in predictions very close to FAO PM, which is the most established reference method, for the examined period for the Peloponnese. Therefore, subject to further investigation, ANNs show potential for general use in modeling ETo across Greece.
This study shows that ANNs can be useful alternatives for predicting ETo while requiring limited climate data as input. This is particularly useful where climate data are scarce or the cost of meteorological stations is not affordable, such as in developing countries. Apart from the cost, ANNs provide a time-saving and computationally lighter alternative to the complex algorithms currently used, which is applicable for interdisciplinary research purposes.

Conclusions
Among the tested ANNs, MLPs generally performed better than RBFs for multiple inputs, whereas RBFs performed slightly better when a sole input (Rn or Tmean) was set. The best performance was obtained by the MLP1 7-2 model, with all the available variables as inputs and only one hidden layer, with RMSE = 0.290 mm d⁻¹, R² = 98% and RE values of the testing and validation phases equal to 2.7% and 2.2%, respectively. This shows that even simple ANN architectures can constitute very satisfactory predictive models. Models with only two parameters as inputs exhibited RMSE values up to 0.352 mm d⁻¹ and R² values at least equal to 97% (MLP6, MLP13). When a sole input was used (Tmean or Rn) in RBF models, RMSE was below 0.385 mm d⁻¹ and R² was at least equal to 96%. The results in both cases are very satisfactory. MLP1, which outperformed the rest of the ANNs, determined the order of importance of the parameters that affect ETo. The two most influential parameters were Tmean and u₂. Tmean is commonly a parameter to which ETo variances are attributed, as it reflects the overall energy state of the system. For the Peloponnese, where the variance of latitude (and consequently of solar radiation) is minor, Tmean variances occur mostly due to distinct differences in relief, LU/LC types and proximity to the coast over short distances. Wind speed (u₂) plays a substantial role, especially in August, when any increased u₂ values directly affect the ETo values, since such values are not frequent in summertime. Future research could test the MLP1 performance over a longer period and across different areas of Greece that differ in micro-climatic conditions and regimes. Moreover, direct measurements such as pan evaporation measurements employing ANNs could be investigated on a local scale.
Another interesting idea, based on the satisfactory results of this study regarding a sole input variable, would be to explore the potential of multilinear regression analysis, which is a simpler method and comprehensible to a wider interdisciplinary audience.

Acknowledgments: The authors acknowledge the National Observatory of Athens (https://meteosearch.meteo.gr (accessed on 15 April 2022)) for ground-based data availability of sixty-two meteorological stations.

Conflicts of Interest:
The authors declare no conflict of interest.