A Thorough Evaluation of 127 Potential Evapotranspiration Models in Two Mediterranean Urban Green Sites

Abstract: Potential evapotranspiration (PET) is a particularly important parameter for understanding water interactions and balance in ecosystems, while it is also crucial for assessing vegetation water requirements. The accurate estimation of PET is typically data demanding, while specific climatic, geographical and local factors may further complicate this task. Especially in city environments, where built-up structures may highly influence the micrometeorological conditions and urban green sites may occupy limited spaces, the selection of proper PET estimation approaches is critical, considering also data availability issues. In this study, a wide variety of empirical PET methods were evaluated against the FAO56 Penman–Monteith benchmark method in the environment of two Mediterranean urban green sites in Greece, aiming to investigate their accuracy and suitability under specific local conditions. The methods under evaluation cover the full range of empirical PET estimations: namely, mass transfer-based, temperature-based, radiation-based, and combination approaches, including 112 methods. Furthermore, 15 locally calibrated and adjusted models have been developed based on the general forms of the mass transfer, temperature, and radiation equations, improving the performance of the original models for local application. Among the 127 (112 original and 15 adjusted) evaluated methods, the radiation-based methods and adjusted models performed overall better than the temperature-based and the mass transfer methods, whereas the data-demanding combination methods received the highest ranking scores. The adjusted models seem to give accurate PET estimates for local use, while they might be applied in sites with similar conditions after proper validation.


Introduction
Evapotranspiration (ET) is a key component of the water cycle, and in rainfed ecosystems, it is the main consumer of available precipitation water [1][2][3]. The anticipated climate trends suggest that the magnitude of ET will increase due to warming and changing precipitation patterns impacting the earth's ecosystems [4]. Due to its significance, accurate measurements or estimates of ET are crucial. However, direct ET measurement by methods such as lysimeters [5] or eddy covariance [6,7] is difficult due to the need for expensive equipment or application difficulties. The estimation of ET from common meteorological data is generally acceptable, since it is easier and in many cases produces reliable estimates.
The site-specific characteristics highly influence the ET magnitudes. Thus, numerous estimation models have been proposed worldwide with different approaches, whereas the substrate at each site highly influences the ET rates [8]. In general, four major groups of methods can be defined to classify the empirical ET models: the mass-transfer-based methods, the temperature-based methods, the radiation-based methods and the combination methods. In all cases, the proposed equations aim to provide reliable estimates of the water demand driven by atmospheric conditions by minimizing the impact of plant species, vegetation stage or soil. To accomplish this, the estimates of ET are generally referred to as potential (PET) or reference evapotranspiration, which are two different terms for expressing the water demand with different conceptual physical bases. The selection of the appropriate PET method is particularly important as it affects hydrometeorological and climatic variables that are linked to the sustainability of natural ecosystems [9].
Raza et al. [10] performed a comprehensive review of studies using several empirical evapotranspiration models and found that Thornthwaite's 1948 and Hargreaves-Samani's 1985 models were the most widely used among the temperature-based models, whereas Priestley-Taylor 1972 and Ritchie 1972 were the most often used among the radiation-based ones. However, the Penman-Monteith model is the most widely used across all categories.
The Penman-Monteith model is generally accepted as the most accurate method to estimate maximum ET as also suggested by the FAO (Food and Agriculture Organization of the United Nations) and WMO (World Meteorological Organization). In many studies, FAO56-PM is used as the standard method to compare and evaluate the performance of other methods in specific sites, areas or regions [11][12][13][14][15][16][17]. The FAO adopted the concept of reference evapotranspiration in the FAO guidelines for crop water requirements by Doorenbos and Pruitt [18,19]. This approach to calculating crop evapotranspiration is widely accepted by engineers, agronomists and researchers in practice, design and research. The reference concept relates to a growing reference grass crop and is represented in FAO-24 by climate types calibrated with lysimeter data from various locations [20]. However, many have pointed to weaknesses in the FAO-24 methodologies for implementation on a global scale. Researchers have tried to improve the evapotranspiration estimations for different locations and data availability through experimental and theoretical studies. First, the correlation of the calculated crop evapotranspiration with a reference crop proved difficult. The definition of a grass variety and its morphological characteristics have not been standardized for different climatic conditions. Furthermore, grass management varies from site to site and over time within the same site. Others have suggested alfalfa as a reference crop, but they have encountered similar variety and management problems [11,21,22,23,24].
The FAO56 Penman-Monteith equation incorporating standardized roughness and the bulk surface resistance parameters is recommended as the globally used equation to represent the new definition of reference evapotranspiration, replacing the Penman combination model. Thus, the reference grass evapotranspiration is redefined as the evapotranspiration from a clipped extended grass surface of 12 cm height with a total surface resistance equal to 70 s m−1. This change in definition and the choice of a specific calculation method is intended to help eliminate problems in measuring a true evapotranspiration rate and provide consistent estimates across regions of the globe. The use of the FAO Penman-Monteith equation overcomes the overestimation problems of the earlier FAO Penman combination method. A hypothetical calculation of reference evapotranspiration can be used to calibrate empirical evapotranspiration equations and be considered as the basis for determining crop coefficients where evapotranspiration cannot be measured simultaneously with specific crop evapotranspiration.
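As an illustration of the benchmark described above, the daily FAO56 Penman-Monteith equation for the grass reference surface can be sketched as follows. This is a minimal sketch and not the processing code of this study: the function names are hypothetical, the soil heat flux G defaults to zero as stated in the table notes, and the vapor pressure relations are the standard FAO56 ones. The numerical inputs in the usage lines are illustrative values, not measurements from the two sites.

```python
import math


def saturation_vapor_pressure(t_c):
    """Saturation vapor pressure (kPa) at air temperature t_c (deg C)."""
    return 0.6108 * math.exp(17.27 * t_c / (t_c + 237.3))


def slope_vapor_pressure_curve(t_c):
    """Slope of the saturation vapor pressure curve (kPa per deg C)."""
    return 4098.0 * saturation_vapor_pressure(t_c) / (t_c + 237.3) ** 2


def fao56_pm_daily(t_mean, rn, u2, es, ea, gamma, g=0.0):
    """Daily reference evapotranspiration (mm per day).

    t_mean: mean air temperature (deg C); rn, g: net radiation and soil
    heat flux (MJ m-2 day-1); u2: wind speed at 2 m (m s-1); es, ea:
    saturation and actual vapor pressure (kPa); gamma: psychrometric
    constant (kPa per deg C).
    """
    delta = slope_vapor_pressure_curve(t_mean)
    radiation_term = 0.408 * delta * (rn - g)
    aerodynamic_term = gamma * 900.0 / (t_mean + 273.0) * u2 * (es - ea)
    return (radiation_term + aerodynamic_term) / (delta + gamma * (1.0 + 0.34 * u2))


# Illustrative inputs only (assumed values for a mild summer day).
et0 = fao56_pm_daily(t_mean=16.9, rn=13.28, u2=2.078,
                     es=1.997, ea=1.409, gamma=0.0666)
print(round(et0, 2))
```

The constants 900 and 0.34 encode the 12 cm grass height and the 70 s m−1 surface resistance of the reference surface, which is why the function takes no vegetation parameters.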
The need for new methods arises because FAO56-PM produces accurate PET estimates, but its application requires a considerable number of meteorological parameters, which are not measured in many areas. Thus, the adjustment or calibration of simpler original methods with fewer data requirements is very important to accurately estimate PET, particularly in regions where meteorological data are rare.
Solar radiation and air temperature are related parameters, considered as the most important for the determination of PET especially in summer [25,26], whereas relative humidity typically drives ET in winter [25]. The impact of wind speed appears to be minor [25]; however, there are studies [27] indicating a strong wind dependence of PET. In all cases, the large spatial variability and the site-specific characteristics are considered as key factors for the formation of PET [27,28] along with seasonality [25,26].
Several methods have been proposed for PET estimation. The method of Hargreaves and Samani (1985) has been extensively used in many applications due to its low data requirements and its simplicity. Similar approaches were proposed by many authors including Schendel [29], Baier-Robertson [30], and Trajkovic [31]. Shirmohammadi-Aliakbarkhani and Saberali [32] suggested that the Hargreaves-Samani method is a simple and reliable alternative for the estimation of ET in arid areas of Iran by assessing meteorological data from 13 sites in northeast Iran. The methods of Thornthwaite, Priestley and Taylor, Makkink and Abtew are recommended for humid climates, while that of Hargreaves and Samani is recommended for arid and semi-arid conditions, and those of Hamon and Linacre are recommended for all climates.
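To show why the Hargreaves and Samani (1985) method is attractive where data are scarce, the sketch below computes it from daily maximum and minimum temperature, latitude and day of the year only. It assumes the commonly cited form PET = 0.0023 Ra (T + 17.8) (Tmax − Tmin)^0.5 with Ra expressed in mm d−1, and the FAO56 formulation of extraterrestrial radiation; function names and input values are hypothetical and are not taken from this study.

```python
import math


def extraterrestrial_radiation(lat_rad, day_of_year):
    """Daily extraterrestrial radiation Ra (MJ m-2 day-1), FAO56 formulation."""
    gsc = 0.0820  # solar constant, MJ m-2 min-1
    dr = 1.0 + 0.033 * math.cos(2.0 * math.pi * day_of_year / 365.0)
    decl = 0.409 * math.sin(2.0 * math.pi * day_of_year / 365.0 - 1.39)
    ws = math.acos(-math.tan(lat_rad) * math.tan(decl))  # sunset hour angle
    return (24.0 * 60.0 / math.pi) * gsc * dr * (
        ws * math.sin(lat_rad) * math.sin(decl)
        + math.cos(lat_rad) * math.cos(decl) * math.sin(ws)
    )


def hargreaves_samani(t_max, t_min, lat_rad, day_of_year):
    """Hargreaves-Samani PET (mm per day) from temperature data only."""
    t_mean = 0.5 * (t_max + t_min)
    # 0.408 converts MJ m-2 day-1 to the equivalent evaporation in mm per day.
    ra_mm = 0.408 * extraterrestrial_radiation(lat_rad, day_of_year)
    return 0.0023 * ra_mm * (t_mean + 17.8) * math.sqrt(max(t_max - t_min, 0.0))


# Assumed example: a mid-July day at the latitude of the Heraklion site.
lat = math.radians(35.31)
pet_summer = hargreaves_samani(t_max=30.0, t_min=20.0, lat_rad=lat, day_of_year=196)
pet_winter = hargreaves_samani(t_max=15.0, t_min=9.0, lat_rad=lat, day_of_year=15)
print(round(pet_summer, 2), round(pet_winter, 2))
```

The expression for the sunset hour angle is valid away from polar latitudes, which is sufficient for Mediterranean sites.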
In general, simple empirical equations have been evaluated for a variety of climates and regions worldwide, showing different performances and highlighting the need for local calibration. Lang et al. [16] investigated the performance of eight methods in southwestern China and found high variability between different regions. The authors found that Hargreaves-Samani, Priestley-Taylor and Abtew were overestimating and Makkink, Thornthwaite, Hamon, Linacre and Blaney-Criddle were underestimating ET, although they noted the good performance of specific methods when applied to specific regions of southwestern China. Lang et al. [16] also supported the overall better performance of the radiation-based methods compared to the temperature-based ones, proposing Makkink as the best radiation method and Hargreaves-Samani as the best temperature method for their study area.
Similarly, Makkink was reported to perform well in Malaysia [33], but its performance was poor in the southeastern United States [34], and this was attributed to the different climatic conditions and geographical environments [16]. Priestley-Taylor was suggested by Wei and Menzel [35] as the most suitable method for global application. Thornthwaite was found to perform worst in many regions [16,34,36,37], which was probably because it takes into consideration only temperature and because it was established in a valley's humid climate. There are, however, many studies suggesting Thornthwaite as a well-performing method, e.g., in Malaysia [38,39].
Bourletsikas et al. [14] evaluated the performance of 24 empirical PET models in a forest ecosystem in central Greece, using daily data for a 17-year time period and several statistical indices. They suggested the use of the Copais and original Hargreaves methods for the daily PET estimation in forest environments, which were followed by Valiantzas (T, Rs) and Valiantzas (T, Rs, RH). The authors also proposed using the models of Turc, modified Hargreaves-Samani after Droogers and Allen (2002), the Sun Thermal Unit (STU), and Jensen-Haise, which also had a good performance. They also recommended local calibration for the use of all tested mass transfer-based methods (Albrecht, Mahringer, Penman, Romanenko, WMO), as well as Abtew, Caprio, de Bruin-Keijman, FAO24 Radiation, Hansen, Makkink, McGuinness-Bordne, Priestley-Taylor and modified Thornthwaite by Siegert and Schrodter.
In all cases, the characteristics of the surfaces, the prevailing local conditions and the number of input parameters in the empirical models affect the accuracy of the PET estimates. Bogawski and Bednorz [40] reported that the performance of empirical PET methods decreases as the available input data become fewer.
Assessments of PET are typically performed in agricultural areas or on the larger scale of a basin. In the urban environment, PET is generally neglected, since built-up cities, covered by a variety of materials, prevent the free movement of water or make it difficult to study. However, in urban green areas (i.e., parks), PET is of critical importance, determining the water requirements of the urban vegetation for its survival in the city's unfavorable environment, which is characterized by increased temperatures and thermal stress as well as reduced water vapor content and decreased water quantities for irrigation, especially in Mediterranean and arid climates. In a recent study, Zhou et al. [41] describe the complex heat storage and shading effects in the urban environment, underlining also that neglecting the shading effects alone leads to an overestimation of urban evapotranspiration of about 38.7%. In addition, the variable reflectance characteristics of the urban surfaces (even green ones) and surface temperatures, in association with urban heat island and drought phenomena, highly affect ET [42][43][44] in cities.
The aim of this study is to extend the existing knowledge and understanding about the impact of the built-up environment on the water requirements of urban vegetation, considering the significance of urban green spaces and their multiple socioeconomic and environmental benefits [45,46]. Toward this goal, 112 empirical PET methods were thoroughly evaluated against the benchmark FAO56-PM method in the Mediterranean environment of two Greek cities. Specifically, high-quality data from meteorological stations located above two urban green sites were used to test the performance of the methods including temperature-based, radiation-based, mass transfer and combination approaches, distinguishing the most suitable ones under different conditions and data availability schemes. In addition, locally adjusted mass transfer, temperature and radiation-based models are developed for enhancing the accuracy of PET estimations while maintaining low data requirements. Apart from the evaluation of a large number of methods, many of which have rarely been used in the literature, this study focuses on the micrometeorological aspects of urban green areas, which can provide crucial information on this vital resource for a sustainable and high-quality life in the city under a changing climate.

Study Sites and Instrumentation
The present study was conducted in urban green areas in two cities in Greece: Amaroussion (central Greece) and Heraklion (South Greece-Crete island). The sites' locations are presented in Figure 1.
The site in Heraklion (35.31° N, 25.14° E, alt.: 81 m a.s.l.) is located on the island of Crete in the southern part of Europe. It is also an urban green area covered to a lesser degree by vegetation. The vegetation in the site includes trees, shrubs and herbaceous plants. The trees are generally deciduous broad-leaved (e.g., Ficus carica L., F. elastica Roxb., Citrus reticulata L., C. limon L., Olea europaea L., Pinus brutia Tenore.) and randomly distributed in the site. The shrub-covered surfaces host a variety of species (e.g., Pittosporum tobira (Thunb.) W.T. Aiton, Nerium oleander L., Rosmarinus officinalis L.) in mixed patterns with herbaceous plants (e.g., Convolvulus arvensis L., Glebionis coronaria (L.) Spach, Malva sylvestris L., Medicago lupulina L., Oxalis pescaprae L.). The climate in the area is sub-humid [48,49] according to UNEP's [50] aridity classification system based on Thornthwaite's [51,52] water balance approach, presenting also high decadal variability toward warmer [54,55] and more arid conditions [49,56], with more frequent droughts in recent years compared to the past [55].
In the two sites, two micrometeorological stations were established for the constant monitoring of the aerial and soil environment. Both stations were equipped with sensors measuring temperature-relative humidity (EE08, E+E Elektronik Ges.m.b.H., Engerwitzdorf, Austria), wind speed and wind direction (Small Wind Transmitter, THIES CLIMA, Adolf Thies GmbH & Co. KG, Göttingen, Germany), precipitation (PROFESSIONAL, Pronamic ApS, Skjern, Denmark), global solar radiation at wavelengths 305-2800 nm (Pyranometer SP-Lite, ADCON Telemetry, Klosterneuburg, Austria, with a sensitivity change of 2% per year), and photosynthetically active radiation at 400-700 nm (QSO-S Quantum sensor, Apogee Instruments, Inc., Logan, Utah, USA, with ±5% accuracy). The measurements were conducted every 5 s, and the 10 min averages were recorded.
The available data cover the time period from 24 September 2019 to 31 December 2022 in Amaroussion and from 18 October 2019 to 31 December 2022 in Heraklion. The monthly values of temperature, relative humidity, wind speed and precipitation at the two sites during these periods are presented in Figure 2. The acquired data patterns are consistent with the climatic patterns expected for these areas.

PET Methods
The estimation of PET was performed by employing 112 empirical methods, which can be categorized into four distinct groups based on the variables required for their application:
• 12 mass-transfer-based methods following the general form of PET = f (u, T, RH). These methods are based on the assumption that evapotranspiration is affected by the air movements considering also atmospheric dryness, which is expressed by the difference between air vapor pressure at saturation (es) and actual vapor pressure (ea). In all cases, the vapor pressure deficit effect is corrected by the addition of the aerodynamic term as a function of wind speed u. For the PET estimation, wind speed (u), air temperature (T) and relative humidity (RH) data are required.
• 48 temperature-based methods following the general forms of PET = f (T), 34 methods (Equations (13)-(46)); PET = f (T, RH), 13 methods (Equations (47)-(59)); and PET = f (T, PR), 1 method (Equation (60)), presented in Table 2.
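The mass-transfer form PET = f (u, T, RH) described above can be sketched in a few lines. The example assumes the Dalton-type expression PET = (a + bu)(es − ea) that is given later in the text for several of these methods, computes es with the standard FAO56 relation, and approximates ea from mean temperature and mean relative humidity, which is a simplification. The coefficients a and b used in the usage line are placeholders chosen only for illustration; they are not the coefficients of any model evaluated in this study.

```python
import math


def saturation_vapor_pressure(t_c):
    """Saturation vapor pressure es (kPa) at air temperature t_c (deg C)."""
    return 0.6108 * math.exp(17.27 * t_c / (t_c + 237.3))


def vapor_pressure_deficit(t_c, rh_percent):
    """Vapor pressure deficit es - ea (kPa), with ea = es * RH / 100."""
    es = saturation_vapor_pressure(t_c)
    ea = es * rh_percent / 100.0
    return es - ea


def mass_transfer_pet(u, t_c, rh_percent, a, b):
    """Dalton-type estimate: the deficit scaled by a linear wind function."""
    return (a + b * u) * vapor_pressure_deficit(t_c, rh_percent)


# Placeholder coefficients (hypothetical), wind in m s-1, T in deg C, RH in %.
print(round(mass_transfer_pet(u=2.0, t_c=25.0, rh_percent=55.0, a=1.0, b=0.5), 2))
```

With saturated air (RH = 100%) the deficit vanishes and the estimate is zero regardless of wind speed, which reflects the physical assumption behind this group of methods.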

Temperature-Based Methods
Table 2. Temperature-based methods (columns: Equation, PET = f (T), Equation number, Ref.). [Table body not recovered.]
* [...] (Equations (13), (25), (27), (28), (33) and (34)), p represents the daily percentage (%) of annual daytime hours for each day of the year, N represents the maximum daily sunshine hours, and T, Tmax and Tmin are the daily mean, maximum and minimum air temperatures.
In Copais, m1 = 0.057, m2 = 0.277, m3 = 0.643 and m4 = 0.0124.
* where T, Tmax and Tmin represent the average, maximum and minimum daily air temperatures in °C, RH is the relative humidity in %, ϕ is the latitude in radians, Rs, Rn and Ra are the global solar, the net and the extraterrestrial radiation fluxes, respectively, in MJ m−2 day−1, G is the soil heat flux in MJ m−2 day−1 (G = 0), ∆ is the slope of the vapor pressure curve (kPa °C−1) [...]
Table 4. Combination methods.

Combination Methods
(112) [122,123]
* where T and Tmin are the average and minimum daily air temperatures in °C, Td is the dewpoint in °C, RH is the relative humidity in %, ϕ is the latitude in radians, z is the altitude in m, Rs, Rn and Ra are the global solar, the net and the extraterrestrial radiation fluxes in MJ m−2 day−1, G is the soil heat flux in MJ m−2 day−1 (G = 0), ∆ is the slope of the vapor pressure curve (kPa °C−1), γ is the psychrometric constant (kPa °C−1), J is the day of the year, λ = 2.501 − 0.002361 T, in MJ kg−1, es and ea are the saturation and actual vapor pressures in kPa, u is the wind speed at a height of 2 m in m s−1 and ub is the wind speed in Beaufort.
The equations of all models used in this work are presented for each group in the respective tables. Details for the estimation of the parameters used in the equations can be found in Allen et al. [11] and Proutsos et al. [138,139]. The PET estimates with negative values were excluded from the analysis.

Statistical Indices and Ranking
To compare the estimations of PET by the different models against the estimates by FAO56-PM, the commonly used coefficients of the linear regression y = ax + b were employed: slope a, intercept b and coefficient of determination R². Four additional statistical measures recommended by Fox [140] were applied: the mean bias error (MBE) to assess the bias, the variance of the differences distribution sd² to evaluate the variability of the differences between the PET values around the MBE, the mean absolute error (MAE) and the root mean square error (RMSE) to express the average difference. The index of agreement (d) was also used to make the cross-comparison between the models [141][142][143]. The analytic equations for the estimation of the indices are presented in Appendix A (Equations (A1)-(A5)).
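The indices listed above can be computed as in the following sketch. Since the Appendix A equations are not reproduced here, the code assumes the commonly used textbook definitions: differences taken as model minus benchmark, the variance of the differences computed with n − 1 in the denominator, the regression fitted with the FAO56-PM values as x and the model values as y, and the index of agreement in Willmott's original form. The function name and the short input series are hypothetical.

```python
import math


def evaluation_indices(model, benchmark):
    """Return regression and error indices for paired daily PET series."""
    n = len(benchmark)
    mean_m = sum(model) / n
    mean_b = sum(benchmark) / n
    diffs = [m - b for m, b in zip(model, benchmark)]

    mbe = sum(diffs) / n                                   # mean bias error
    mae = sum(abs(d) for d in diffs) / n                   # mean absolute error
    rmse = math.sqrt(sum(d * d for d in diffs) / n)        # root mean square error
    sd2 = sum((d - mbe) ** 2 for d in diffs) / (n - 1)     # variance of differences

    # Ordinary least squares for y = a x + b (x: benchmark, y: model).
    sxx = sum((b - mean_b) ** 2 for b in benchmark)
    syy = sum((m - mean_m) ** 2 for m in model)
    sxy = sum((b - mean_b) * (m - mean_m) for m, b in zip(model, benchmark))
    slope = sxy / sxx
    intercept = mean_m - slope * mean_b
    r2 = sxy ** 2 / (sxx * syy)

    # Willmott's index of agreement d (1 means perfect agreement).
    denom = sum((abs(m - mean_b) + abs(b - mean_b)) ** 2
                for m, b in zip(model, benchmark))
    d_index = 1.0 - sum(d * d for d in diffs) / denom

    return {"a": slope, "b": intercept, "R2": r2, "MBE": mbe,
            "MAE": mae, "RMSE": rmse, "sd2": sd2, "d": d_index}


# Hypothetical series: the model overestimates the benchmark by 0.5 mm per day.
stats = evaluation_indices(model=[1.5, 2.5, 3.5, 4.5], benchmark=[1.0, 2.0, 3.0, 4.0])
print(stats)
```

A constant offset, as in the usage example, leaves the slope and R² perfect while MBE, MAE and RMSE all report the bias, which illustrates why several indices are combined for the ranking. The sketch does not guard against series with fewer than two values or zero variance.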
To rank the methods, the above indices were used, and through a standardization procedure proposed by Aschonitis et al. [144] and also described in Rahimikhoob et al. [145], the standardized ranking performance index (sRPI) was estimated by the equations presented in Appendix A (Equations (A6)-(A9)).

Results
The micrometeorological stations of this study were installed above grass-covered irrigated surfaces inside the urban green spaces. Such surface characteristics allow the accurate estimation of PET by the application of the Penman-Monteith method considering that the measured meteorological parameters are highly affected by the substrate above which the measurements are taken [8].
The PET estimates by the FAO56-PM method are higher at the southern site of Heraklion, with an annual average of 3.37 ± 1.92 mm d−1, which is slightly higher than the respective value for Amaroussion (3.10 ± 1.92 mm d−1). Both sites present high seasonal variability with ET values ranging from 1.44 ± 0.49 mm d−1 in winter to 5.87 ± 0.77 mm d−1 in summer in Heraklion and from 1.05 ± 0.41 mm d−1 in winter to 5.48 ± 1.00 mm d−1 in summer in Amaroussion. The day-to-day and monthly values are even more variable, as depicted in Figure 3.
The daily values of Figure 3 were used as the basis for comparing PET with the respective estimates by the application of other methods. The results per method category follow.
The ranking scores for both sites (derived as averages of the sRPI) suggest that Mahringer 1970 (Equation (10)) had the best performance among the mass transfer methods, followed by WMO 1966 (Equation (9)) and Linacre 1992 (Equation (12)), which ranked 45th, 47th and 49th among the 112 examined models with sRPI values of 0.827, 0.826 and 0.822, respectively. The correlations of the five best-performing models of this category are presented in Figure 4.

Temperature-Based Methods
The PET estimates by the application of 48 temperature-based empirical models (Equations (13)-(60)) are presented against the respective daily values by FAO56-PM for the two sites in Figures A2 and A3 (Appendix B). The general patterns indicate generally higher estimates by the methods of this category in Heraklion compared to the site in Amaroussion. The statistics from the comparisons for all methods in both sites are presented in Table A2 (Appendix C).
The statistical indices and the ranking (among the 112 examined models) for the five best-performing temperature-based methods for each site are shown in Table 6. [...] (Equation (28)) are ranked highest among the temperature-based PET methods (27th, 31st and 35th, respectively, at the overall ranking) with sRPI scores of 0.888, 0.877 and 0.869, respectively. It is worth noting that all 48 temperature-based methods received sRPI scores ranging from 0.487 to 0.888, and 15 of them had sRPI values greater than 0.800, whereas 4 out of the 12 mass transfer-based methods had sRPIs greater than 0.800. The correlations of the daily values estimated by the five best-performing methods of this category against the FAO56-PM method are presented for both sites in Figure 5.

Radiation-Based Methods
The 40 radiation-based methods (Equations (61)-(100)) examined in the two study sites produced daily estimates presented in conjunction with the FAO56-PM estimates in Figures A4 and A5 (Appendix B). The comparison between the values produced the statistics presented in Table A3 (Appendix C). The statistical indices for the five best-performing radiation-based methods in each site are presented in Table 7.
Table 7. Statistical indices (mean, slope a, intercept b, and coefficient of determination R² of the linear regression y = ax + b, mean bias error (MBE), root mean square error (RMSE), mean absolute error (MAE), standard deviation square (sd²), and index of agreement d) and ranking (sRPI Score and Rank) based on the optimum values of the statistical indices for the five best-performing radiation-based models for the estimation of PET compared to the benchmark method of FAO56-PM in the two urban green sites of Heraklion and Amaroussion.
In Heraklion, the best-performing radiation-based methods were Ahooghalaandari et al. 2017 (2) (Equation (99)) followed by Castañeda and Rao 2005 (2) (Equation (79)) and Priestley and Taylor 1972 (Equation (85)), which were ranked 4th, 8th and 9th among all 112 models, with sRPI scores of 0.955, 0.943 and 0.941, and mean PET estimates +7.7%, −0.7% and −0.5% different compared to FAO56-PM, respectively. Ahooghalaandari et al. [...] (Equation (87)) ranking 103rd with sRPI = 0.603, producing PET means −30.7% and +67.6% different compared to FAO56-PM. In general, however, the radiation methods in Heraklion had a good performance in most cases, since the produced PET means were less than 10% different from FAO56-PM in 19 out of the 40 methods.

In Amaroussion, Priestley and Taylor 1972 (Equation (85)) was ranked first among the radiation-based methods (2nd among all 112 models, with sRPI = 0.972), followed by Abtew 1996 (4) (Equation (86)) [...] (Equation (87)) ranking 102nd were the two worst methods with average sRPI values from both sites of 0.543 and 0.612, respectively. The five best-performing methods for both sites (according to the average sRPI scores) are depicted in Figure 6.

Combination Methods
The PET estimates from the 12 combination methods (Equations (101)-(112)) assessed in this study are depicted against the PET daily values in Figure A6 (Appendix B), whereas the statistical index values used for the ranking of the methods are presented in Table A4 (Appendix C). The graphs and the statistical results suggest that this category of models produces good PET estimates compared to all other categories.
The statistics for the five best-performing methods of this PET model category are presented for both sites in Table 8. The assessment of all combination methods statistics, presented also in Table A4 [...] (Equation (103)), which were ranked 58th and 53rd with sRPI values of 0.792 and 0.813, respectively. The mean PET values of these methods were +25.3% and +20.4% higher compared to FAO56-PM.
In Amaroussion, Wright 1996 (Equation (108)) was the best-performing model, ranked 1st with sRPI = 0.992, followed by Jensen et al. 1990 (Equation (106)), Valiantzas 2006 (2) (Equation (109)), and Valiantzas 2013 (6) (Equation (112)), which were ranked 3rd, 4th and 5th with similar sRPI values (0.963, 0.962 and 0.962). Wright 1996 (Equation (108)) showed the best slope a (0.984) and d (0.991) and the minimum RMSE (0.393 mm d−1), MAE (0.264 mm d−1) and sd² (0.159 mm d−1) values, producing a mean PET estimate +0.97% higher compared to FAO56-PM. Also, in Amaroussion, Jensen et al. 1990 (Equation (106)) had the best offset b (0.003), but its mean PET was +12.3% higher compared to FAO56-PM. As in Heraklion, the worst combination methods for Amaroussion were also FAO24 Radiation (Equation (105)) followed by the modified Makkink by Doorenbos and Pruitt 1977 (Equation (103)), which ranked 54th and 49th, respectively, among the 112 models, presenting relatively low sRPI values (0.819 and 0.828) and also mean PET values +37.3% and +33.1% higher compared to FAO56-PM.
Table 8. Statistical indices (mean, slope a, intercept b, and coefficient of determination R² of the linear regression y = ax + b, mean bias error (MBE), root mean square error (RMSE), mean absolute error (MAE), standard deviation square (sd²) and index of agreement d) and ranking (sRPI Score and Rank) based on the optimum values of the statistical indices for the five best-performing combination PET models compared to the FAO56-PM benchmark method in the two urban green sites of Heraklion and Amaroussion.
The combination methods ranking for both sites depicts Wright 1996 (Equation (108)) as the best combination model, followed by Valiantzas 2006 (2) (Equation (109)) and Jensen et al. 1990 (Equation (106)). These models were ranked 1st, 2nd and 3rd among all 112 investigated methods and received the highest sRPI scores (average sRPI scores from both sites: 0.990, 0.966 and 0.963, respectively). The daily PET estimates by the five best-performing combination methods against FAO56-PM are presented in Figure 7. In all cases, however, the combination methods performed better than all other method categories, since they presented high sRPI scores (higher than 0.806), which is rather expected considering the higher number of input parameters required for the application of the combination equations.

Models Adjustment
The local calibration of the empirical models for the PET estimation is suggested in most research works and is also supported by the results of the present study. In this work, an adjustment of the general forms of mass transfer, temperature and radiation-based equations was performed for local use in the territories of our study sites. Based on the daily data from both stations, 15 adjusted PET models were produced following the general forms of several widely used equations. For example, the mass transfer models proposed by Dalton 1802 (Equation (1)), Fitzgerald 1886 (Equation (2)), Meyer 1926 (Equation (4)), Rohwer 1931 (Equation (5)), Albrecht 1950 (Equation (7)) and WMO 1966 (Equation (9)) follow the general form of PET = (a + bu) (es − ea). The adjusted values of a and b, based on the data from the two stations, are presented in Table 9. Similarly, other widely used models were adjusted for local use, and the new models are also presented in Table 9. The performance of the adjusted equations (Equations (113)-(127)) is evaluated following the estimation of statistical indices and ranking as above. The daily PET estimates for the new models are presented for the two sites along with the respective PET values by the FAO56-PM method in Figure 8.
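One possible way to obtain locally adjusted values of a and b for the general form PET = (a + bu)(es − ea) is ordinary least squares against the FAO56-PM benchmark, as sketched below. The fitting procedure is an assumption made for illustration, since the text does not specify how the adjustment was carried out; the function name and the synthetic data are hypothetical. Because the model is linear in a and b, the two coefficients follow from the 2 × 2 normal equations without an iterative solver.

```python
def fit_mass_transfer(u, vpd, pet_ref):
    """Least-squares a, b for pet_ref ~ (a + b * u) * vpd.

    u: wind speed (m s-1); vpd: es - ea (kPa); pet_ref: benchmark PET
    (mm per day). All three are equally long sequences of daily values.
    """
    x1 = list(vpd)                                   # regressor multiplying a
    x2 = [ui * di for ui, di in zip(u, vpd)]         # regressor multiplying b
    s11 = sum(p * p for p in x1)
    s12 = sum(p * q for p, q in zip(x1, x2))
    s22 = sum(q * q for q in x2)
    t1 = sum(p * y for p, y in zip(x1, pet_ref))
    t2 = sum(q * y for q, y in zip(x2, pet_ref))
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("wind speed does not vary enough to separate a and b")
    a = (t1 * s22 - t2 * s12) / det
    b = (s11 * t2 - s12 * t1) / det
    return a, b


# Synthetic check: data generated with a = 0.3 and b = 0.1 are recovered.
wind = [1.0, 2.0, 3.0, 4.0]
deficit = [0.5, 1.0, 1.5, 0.8]
reference = [(0.3 + 0.1 * w) * d for w, d in zip(wind, deficit)]
a_fit, b_fit = fit_mass_transfer(wind, deficit, reference)
print(round(a_fit, 3), round(b_fit, 3))
```

In practice the fitted model would be validated on data not used for the fit before being applied at other sites with similar conditions.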
* where T, Tmax and Tmin are the average, maximum and minimum daily air temperatures in °C, RH is the relative humidity in %, Rs and Rn are the global solar and net radiation fluxes in MJ m−2 day−1, Ra is the extraterrestrial radiation in mm d−1, ∆ is the slope of the vapor pressure curve (kPa °C−1), γ is the psychrometric constant (kPa °C−1), λ = 2.501 − 0.002361 T, in MJ kg−1, es and ea are the saturation and actual vapor pressures in kPa, u is the wind speed at a height of 2 m in m s−1, and C1 and C2 are factors presented in Table 3 (Equation (92)).
The dispersion of the daily PET values depicted in Figure 8, together with the statistical indices of the new methods and their ranking among all 127 models (112 original and 15 adjusted) in both sites (Table 10), suggests that the adjusted models performed better than the original equations.
More specifically, the mass transfer models 1 (Equation (113)) and 2 (Equation (114)) were ranked 66th and 64th (with sRPI scores of 0.803 and 0.813), respectively, among all 127 models, in Heraklion, whereas in Amaroussion, they performed better (ranked 42nd and 44th, with similar sRPI scores of 0.867 and 0.866, respectively). Similarly, the adjusted temperature-based models 3, 4, 5, 6 and 15 (Equations (115)-(118) and (127)) were ranked between 26th and 97th with scores ranging from 0.701 to 0.916, in Heraklion, among which model 4 (Equation (116)), which is actually an adjustment of the Hargreaves and Samani method, performed the best. The temperature-based adjusted models in Amaroussion also performed well, ranking between 21st and 84th among the 127 methods, with sRPI ranging from 0.783 to 0.913, among which model 4 (Equation (116)) performed the best. Finally, the radiation-based adjusted models 7-14 (Equations (119)-(126)) produced in general accurate estimates. Their sRPI scores, in Heraklion, ranged from 0.851 to 0.960, resulting in ranks varying from 4th to 50th, among which model 10 (Equation (122)) performed the best. In Amaroussion, model 8 performed excellently, ranking 2nd among all 127 methods, with a high sRPI value (0.972), whereas the rest of the radiation-based adjusted models also received high sRPI scores ranging from 0.819 to 0.972, with ranks varying between 7th and 67th.
Table 10. [...] (Equations (113)-(127)) compared to the FAO56-PM in the urban green sites of Heraklion and Amaroussion.

Discussion
The PET estimates of the 112 models examined in this work confirm the overall good performance of the combination methods against all other groups of methods in the environment of the two Mediterranean urban green sites, i.e., in Heraklion (S. Greece) and Amaroussion (c. Greece). The general ranking of the methods for both sites indicates that the method of Wright 1996 (Equation (108)) performed the best, followed by Valiantzas 2006 (2) (Equation (109)) [...] (Equation (79)). The above ten are the best-performing methods for both sites, producing the best statistics and the highest sRPI scores (higher than 0.936).
The above-mentioned results confirm that the performance of empirical PET estimation methods generally increases with the number of input parameters [40], with the data-demanding combination methods producing the most accurate estimates. The performance of the radiation-based equations is adequate, and they ranked high among the methods with limited data requirements. The better performance of the radiation methods compared to the temperature-based ones is expected and has also been confirmed by Lang et al. [16], who applied different empirical PET models in southwestern China, suggesting Makkink's model as the best alternative. In the present work, Makkink's original equation was found to perform quite well, ranking 25th among the 112 examined models with an average rank score for both study sites of sRPI = 0.889, whereas its modified form proposed by Castañeda and Rao 2005 (2) (Equation (79)) was ranked among the 10 best-performing methods for both examined sites and received a high sRPI score of 0.936. The good performance of the Priestley and Taylor method in this study (rank 5th/112, sRPI = 0.957) is also in line with the findings of Wei and Menzel [35], who suggested this method for global application.
It should be noted that the radiation-based methods requiring Rn measurements are anticipated to perform better than those requiring Rs, since Rn is closely associated with the surface characteristics and indicates the energy available at the natural surface that can be used for evapotranspiration. However, in this study, Rn is estimated from Rs [11], and thus, its effect cannot be evaluated as in the case of real in situ Rn measurements. In any case, the best two radiation methods (also included among the 10 best out of the 112 original models) require Rn, i.e., Priestley and Taylor 1972 (Equation (85)) and Abtew 1996 (4) (Equation (86)).
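To make the role of Rn in such radiation-based methods concrete, the sketch below implements the commonly cited form of the Priestley and Taylor equation, PET = α [∆/(∆ + γ)] (Rn − G)/λ with α = 1.26. This is an illustrative assumption: the exact form and coefficients of Equation (85) may differ. The slope ∆ and the psychrometric constant γ follow the standard FAO56 expressions; the default pressure of 101.3 kPa (sea level), the zero daily soil heat flux G and all function names are assumptions made for this example.

```python
import math


def saturation_vapor_pressure(t_c):
    """Saturation vapor pressure in kPa at air temperature t_c (deg C), FAO56 form."""
    return 0.6108 * math.exp(17.27 * t_c / (t_c + 237.3))


def slope_vapor_pressure_curve(t_c):
    """Slope of the saturation vapor pressure curve (kPa per deg C), FAO56 form."""
    return 4098.0 * saturation_vapor_pressure(t_c) / (t_c + 237.3) ** 2


def priestley_taylor(rn, t_c, g=0.0, alpha=1.26, pressure_kpa=101.3):
    """Daily PET in mm day-1 from net radiation rn (MJ m-2 day-1).

    g is the soil heat flux (assumed 0 at the daily scale) and
    alpha = 1.26 is the classical Priestley-Taylor coefficient.
    """
    lam = 2.501 - 0.002361 * t_c       # latent heat of vaporization, MJ kg-1
    delta = slope_vapor_pressure_curve(t_c)
    gamma = 0.000665 * pressure_kpa    # psychrometric constant, kPa per deg C
    return alpha * delta / (delta + gamma) * (rn - g) / lam


# Example call: Rn = 10 MJ m-2 day-1 at 20 deg C gives roughly 3.5 mm day-1
pet_example = priestley_taylor(10.0, 20.0)
```

Because Rn enters linearly, any bias in an Rn value estimated from Rs propagates proportionally to the PET estimate, which is why estimated and in situ measured Rn cannot be treated as equivalent inputs.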
The limitation of input parameters and the local calibration of the examined models appear to affect their performance in the two sites. It should also be mentioned that almost all models were established in rural areas, and thus, their application in urban environments (even in green spaces) may result in overestimations or underestimations. This is also valid for the FAO56-PM method, which is highly affected by the aerodynamic characteristics of the surface. In all cases, the energy budget and the aerodynamic characteristics of urban green spaces are considerably different from those of open rural areas, and the built-up urban environment strongly affects the energy exchange processes, the energy budget of the green surfaces, and the wind flow above them, resulting in a complex environment that is difficult to model. Multiple radiation scattering by the built-up environment surrounding the urban green areas and shadowing, as well as the use of artificial materials covering parts of the soil, can result in decreased ET fluxes and overestimation by the applied PET models [41]. However, the estimation of PET by empirical models remains a useful tool to assess plants' water requirements, even in the urban environment.
The general ranking of the 127 methods (112 original and 15 adjusted) after incorporating the scores for both sites is presented in Table A5 (Appendix C). The results suggest that many of the adjusted models performed better than the original equations. More specifically, the mass transfer models 1 (Equation (113)) and 2 (Equation (114)) were ranked 52nd and 51st (with sRPI scores of 0.835 and 0.839), respectively, among all 127 models, whereas the best original mass transfer method, Mahringer 1970 (Equation (10), sRPI = 0.827), is ranked 57th, WMO 1966 (Equation (9)) is ranked 59th, and all others were ranked much lower than the adjusted mass transfer models.
Among the adjusted temperature-based models 3, 4, 5, 6 and 15 (Equations (115)-(118) and (127)), model 4 (Equation (116)), which requires only temperature data and is an adjustment of the Hargreaves-Samani equation, performed best, ranking 22nd (sRPI = 0.915) among the 127 methods and first among all temperature-based models. It was followed by the best original method of Ahooghalaandari et al. 2016 (3) (Equation (58)), which ranked 35th/127 with sRPI = 0.888. It is worth noting that the adjusted Hargreaves-Samani model 4 is ranked 22nd/127, whereas its original form, Hargreaves and Samani 1985 (Equation (22)), is ranked 86th/127 (sRPI = 0.751). It should be stated, though, that in the adjusted model 4, the power of the diurnal temperature range (DTR = Tmax − Tmin) is negative and small, suggesting a minor and negative effect of DTR on PET. Since DTR is considered to be related to atmospheric cloudiness and radiation factors that control plant photosynthesis [138,146,147], and since clear sky conditions (higher DTR) can be associated with higher evapotranspiration rates [148,149], a positive DTR effect on PET would rather be expected. On the other hand, clear sky conditions typically persist in our two sites; thus, DTR is expected to have an overall minor effect on PET.
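The influence of the DTR exponent discussed above can be illustrated with a short sketch. It uses the general Hargreaves-type form PET = a Ra (T + b) DTR^c, where the defaults a = 0.0023, b = 17.8 and c = 0.5 are the widely published coefficients of the original Hargreaves and Samani 1985 equation (Ra in mm d−1). The negative exponent c = −0.05 used below is purely hypothetical and is not the fitted value of model 4 (the fitted coefficients are those listed in Table 9); the function names are likewise assumptions.

```python
def hargreaves_type(ra_mm, t_mean, t_max, t_min, a=0.0023, b=17.8, c=0.5):
    """Hargreaves-type PET (mm day-1): a * Ra * (T + b) * DTR**c, with Ra in mm day-1."""
    dtr = t_max - t_min
    if dtr <= 0:
        raise ValueError("Tmax must exceed Tmin so that DTR is positive")
    return a * ra_mm * (t_mean + b) * dtr ** c


def dtr_doubling_effect(c):
    """Ratio of PET at DTR = 12 deg C to PET at DTR = 6 deg C for exponent c."""
    low = hargreaves_type(16.0, 25.0, 28.0, 22.0, c=c)    # DTR = 6 deg C
    high = hargreaves_type(16.0, 25.0, 31.0, 19.0, c=c)   # DTR = 12 deg C
    return high / low


original_effect = dtr_doubling_effect(0.5)       # original exponent of 0.5
hypothetical_effect = dtr_doubling_effect(-0.05)  # hypothetical small negative exponent
```

With the original exponent, doubling DTR raises PET by about 41%, whereas a small negative exponent such as −0.05 lowers it by only about 3%. This is the sense in which a small negative power implies a minor, negative DTR effect.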
All adjusted models have reduced data requirements, allowing their local application in the two study sites. Nonetheless, it should be stressed that the models' performance will benefit from further adjustments, incorporating a longer timeseries of data from the two stations. Their application in other regions and cities should be performed with caution, following a proper validation. Furthermore, additional adjustments may be applied by incorporating data from new stations with different geographical characteristics. In any case, local calibration can significantly improve the performance of the empirical PET models and is highly recommended, especially in regions with a limited availability of meteorological data. In summary, the best-performing methods with rank scores (sRPI) higher than 0.950 (derived as average values from both study sites) are presented in Table 11.

Conclusions
In the present work, the performance of 112 original empirical models for the estimation of potential evapotranspiration (PET) was investigated by comparing the models' outputs with the PET estimates by the FAO56-PM standard method in two urban green sites in Greece (Heraklion, S. Greece and Amaroussion, c. Greece). Based on the general forms of the original mass transfer, temperature and radiation-based PET models, 15 adjusted equations were also produced and evaluated for application at the local level.
The results confirm that the accuracy of the models increases with the number of input parameters included in the estimations. The combination methods produced in general more accurate PET estimates, followed by the radiation, temperature and mass transfer-based methods.
The combination model proposed by Wright 1996 (Equation (108), ranked 1st among the 112 original models) had the best performance, followed by Valiantzas 2006 (2) (Equation (109), ranked 2nd) and Jensen et al. 1990 (Equation (106), ranked 3rd), which are also combination methods. However, it is important to note that the combination methods require the same input parameters as FAO56-PM; thus, the standard method might be applied directly.
Priestley and Taylor (Equation (85), ranked 5th among the 112 original models) was the best radiation-based model, and Ahooghalaandari et al. 2016 (3) (Equation (58), ranked 27th/112) was the best temperature-based one. Regardless of their high data requirements, the mass transfer methods had insufficient performance, even after adjustment. However, Mahringer 1970 (Equation (10), ranked 45th/112) was the best model of this category.
The adjusted PET models enhanced the performance of the original methods in all cases at the local level of the two study sites. The radiation-based model 10 (PET = f (Rs, T, RH)) was ranked 4th among all 127 models (112 original and 15 adjusted), presenting a high rank score. Also, models 8, 13, 11 and 14 (all radiation-based) produced accurate estimates in both sites, received high scores (>0.951) and ranked among the 10 best-performing methods. Their application in the two sites is recommended in the case of limited data availability; however, they should be applied in other regions with caution, after proper validation and adjustment.
For wider application, it is proposed to test the methods in other cities around the world to evaluate the accuracy of the estimation of urban vegetation water requirements. It is essential, though, to underline the critical importance of the quality of the measurements of the input parameters, which should be obtained above irrigated, grass-covered surfaces, allowing the proper application of the FAO56-PM method. The findings of this study can be useful for the estimation of PET in Mediterranean cities, especially in areas with limited data availability. This can be particularly useful toward informed decision making for urban green infrastructure, including plant species selection, irrigation scheduling and water management as well as urban green management.
The findings of the present study, which is based on ground data, are a useful resource for determining the most appropriate method (especially at the local level) for estimating vegetation water requirements under Mediterranean climate conditions. Based on the above, using remote sensing (satellite) data with the most appropriate PET methods identified at the two investigated sites may produce more accurate local estimates. In future work, the performance of the PET methods can be evaluated by applying both satellite and ground data and comparing the results. Further research is also required in order to validate the performance of the adjusted models by incorporating longer data series. In future work, the authors intend to investigate the performance of the original and adjusted models in other environments (urban or rural).

Appendix C
Values of the statistical indices used in the present work for the ranking of the 112 PET models. The results are presented for both study sites (Heraklion and Amaroussion) and are grouped per category of methods.
The final ranking of all examined models, including the 112 original and 15 adjusted, is also presented in the last table.
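For readers who wish to reproduce indices of the kind reported in the tables of this appendix, the following minimal sketch computes the mean bias error (MBE), mean absolute error (MAE), root mean square error (RMSE), the slope a, intercept b and R² of the linear regression y = ax + b, and the index of agreement d. Several choices are assumptions rather than details confirmed here: errors are taken as model minus benchmark, the regression uses the model estimate as x and the FAO56-PM value as y (as in the scatter plots), and d follows Willmott's standard definition. The sRPI score is not reproduced because its formula is not given in this section.

```python
import math


def evaluation_indices(model, benchmark):
    """Agreement statistics between model PET and benchmark PET (equal-length sequences)."""
    n = len(benchmark)
    mean_m = sum(model) / n
    mean_b = sum(benchmark) / n
    errors = [m - o for m, o in zip(model, benchmark)]  # model minus benchmark

    mbe = sum(errors) / n
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    # Linear regression y = a x + b with x = model estimate, y = benchmark value
    sxx = sum((m - mean_m) ** 2 for m in model)
    syy = sum((o - mean_b) ** 2 for o in benchmark)
    sxy = sum((m - mean_m) * (o - mean_b) for m, o in zip(model, benchmark))
    a = sxy / sxx
    b = mean_b - a * mean_m
    r2 = sxy ** 2 / (sxx * syy)

    # Willmott's index of agreement d (1 indicates perfect agreement)
    potential = sum((abs(m - mean_b) + abs(o - mean_b)) ** 2
                    for m, o in zip(model, benchmark))
    d = 1.0 - sum(e * e for e in errors) / potential

    return {"MBE": mbe, "MAE": mae, "RMSE": rmse, "a": a, "b": b, "R2": r2, "d": d}


# Example: a model that overestimates the benchmark by a constant 0.5 mm per day
stats = evaluation_indices([2.5, 4.5, 6.5, 8.5], [2.0, 4.0, 6.0, 8.0])
```

In this example the constant offset leaves R² at 1 while MBE, MAE and RMSE all equal 0.5, which shows why a high R² alone does not indicate an unbiased model and why several indices are combined in the ranking.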
Table A1. Statistical indices (mean, slope a, intercept b, and coefficient of determination R² of the linear regression y = ax + b, mean bias error (MBE), root mean square error (RMSE), mean absolute error (MAE), standard deviation square (sd²), and index of agreement d) and ranking (sRPI score and rank) based on the optimum values of the statistical indices for the 12 mass transfer-based models (Equations (1)-(12)) for the estimation of PET compared to the benchmark method of FAO56-PM in the two urban green sites of Heraklion and Amaroussion.

Figure 1. (a) Map of the sites and photos of the meteorological stations installed in the urban green spaces (UGSs) of (b) Heraklion (S. Greece-Crete island) and (c) Amaroussion (central Greece).

Figure 2. Monthly average, minimum and maximum values of (a) air temperature in Heraklion, (b) air temperature in Amaroussion, (c) relative humidity in Heraklion, (d) relative humidity in Amaroussion, (e) wind speed and gust in Heraklion, (f) wind speed and gust in Amaroussion, (g) precipitation in Heraklion and (h) precipitation in Amaroussion.

Figure 3. (a) Daily and (b) monthly PET, estimated by the FAO56-PM method at two urban green spaces in the cities of Heraklion and Amaroussion. Vertical lines show the standard deviations.

Table 5. Statistical indices (mean, slope a, intercept b, and coefficient of determination R² of the linear regression y = ax + b, mean bias error (MBE), root mean square error (RMSE), mean absolute error (MAE), standard deviation square (sd²), and index of agreement d) and ranking (sRPI score and rank) based on the optimum values of the statistical indices for the five best mass transfer-based PET models compared to the FAO56-PM benchmark method in the two urban green sites of Heraklion and Amaroussion.

Figure 4. Correlation between daily PET values estimated by the best five mass transfer methods (x-axis) against the benchmark method of FAO56-PM (y-axis) for the two urban green areas in Amaroussion (gray points) and Heraklion (red points) along with the linear regression statistics.The blue line indicates the 1:1 regression.

Figure 5. Correlation between daily PET values estimated by the best-performing temperature-based methods (x-axis) of the general forms PET = f (T, RH or PR) against the benchmark method of FAO56-PM (y-axis) for two urban green areas in Amaroussion (gray points) and Heraklion (red points) along with the linear regression statistics.The blue line indicates the 1:1 regression.

Figure 6. Correlation between daily ET values estimated by the five best-performing radiation-based methods (x-axis) against the FAO56-PM benchmark method (y-axis) for two urban green areas in Amaroussion (gray points) and Heraklion (red points) along with the linear regression statistics. The blue line indicates the 1:1 regression.

Figure 7. Correlation between daily PET values estimated by the five better-performing combination methods (x-axis) against the FAO56-PM benchmark method (y-axis) for two urban green areas in Amaroussion (gray points) and Heraklion (red points) along with the linear regression statistics.The blue line indicates the 1:1 regression.

Figure 8. Correlation between daily PET estimated by the adjusted models (x-axis) and the benchmark method of FAO56-PM (y-axis) for two urban green areas in Amaroussion (gray points) and Heraklion (red points) along with the linear regression statistics. The blue line depicts the 1:1 regression.

Table 10. Statistical indices (mean, slope a, intercept b, and coefficient of determination R² of the linear regression y = ax + b, mean bias error (MBE), root mean square error (RMSE), mean absolute error (MAE), standard deviation square (sd²), and index of agreement d) and ranking (sRPI score and rank) based on the optimum values of the statistical indices for the 15 adjusted PET models (Equations (113)-(127)) compared to the FAO56-PM in the urban green sites of Heraklion and Amaroussion.

Figure A2. Correlation between daily PET values estimated by different temperature-based methods (x-axis) of the general form PET = f (T) and the benchmark method of FAO56-PM (y-axis) for two urban green areas in Amaroussion (gray points) and Heraklion (red points) along with the linear regression statistics. The blue line indicates the 1:1 regression.

Figure A3. Correlation between daily PET values estimated by different temperature-based methods (x-axis) of the general forms PET = f (T, RH or PR) and the benchmark method of FAO56-PM (y-axis) for two urban green areas in Amaroussion (gray points) and Heraklion (red points) along with the linear regression statistics. The blue line indicates the 1:1 regression.

Figure A4. Correlation between daily ET values estimated by different radiation-based methods (x-axis) of the general forms PET = f (Rs) and PET = f (Rs, T) with the benchmark method of FAO56-PM (y-axis) for two urban green areas in Amaroussion (gray points) and Heraklion (red points) along with the linear regression statistics. The blue line indicates the 1:1 regression.

Figure A5. Correlation between daily ET values estimated by different radiation-based methods (x-axis) of the form PET = f (Rs, T, RH) with the benchmark method of FAO56-PM (y-axis) for two urban green areas in Amaroussion (gray points) and Heraklion (red points) along with the linear regression statistics. The blue line indicates the 1:1 regression.

Figure A6. Correlation between daily ET values estimated by different combination methods (x-axis) and the benchmark method of FAO56-PM (y-axis) for two urban green areas in Amaroussion (gray points) and Heraklion (red points) along with the linear regression statistics. The blue line indicates the 1:1 regression.

Table 6. Statistical indices (mean, slope a, intercept b, and coefficient of determination R² of the linear regression y = ax + b, mean bias error (MBE), root mean square error (RMSE), mean absolute error (MAE), standard deviation square (sd²), and index of agreement d) and ranking (sRPI score and rank) based on the optimum values of the statistical indices for the best five temperature-based PET models compared to the benchmark method of FAO56-PM in the two urban green sites of Heraklion and Amaroussion.

Table 9. Adjusted PET models for local use in Heraklion and Amaroussion.

Table 11. Ranking of the best-performing PET estimation models among all 127 investigated methods with sRPI rank scores higher than 0.950.

Table A2. Statistical indices (mean, slope a, intercept b, and coefficient of determination R² of the linear regression y = ax + b, mean bias error (MBE), root mean square error (RMSE), mean absolute error (MAE), standard deviation square (sd²), and index of agreement d) and ranking (sRPI score and rank) based on the optimum values of the statistical indices for the 48 temperature-based PET models (Equations (13)-(60)) compared to the benchmark method of FAO56-PM in the two urban green sites of Heraklion and Amaroussion.

Table A3. Statistical indices (mean, slope a, intercept b, and coefficient of determination R² of the linear regression y = ax + b, mean bias error (MBE), root mean square error (RMSE), mean absolute error (MAE), standard deviation square (sd²), and index of agreement d) and ranking (sRPI score and rank) based on the optimum values of the statistical indices for the 40 radiation-based models (Equations (61)-(100)) for the estimation of PET compared to the benchmark method of FAO56-PM in the two urban green sites of Heraklion and Amaroussion.

Table A4. Statistical indices (mean, slope a, intercept b, and coefficient of determination R² of the linear regression y = ax + b, mean bias error (MBE), root mean square error (RMSE), mean absolute error (MAE), standard deviation square (sd²), and index of agreement d) and ranking (sRPI score and rank) based on the optimum values of the statistical indices for the 12 combination models (Equations (101)-(112)) for the estimation of PET compared to the benchmark method of FAO56-PM in the two urban green sites of Heraklion and Amaroussion.