Prediction of Potential Evapotranspiration Using Temperature-Based Heuristic Approaches

: The potential or reference evapotranspiration ( ET 0 ) is considered as one of the fundamental variables for irrigation management, agricultural planning, and modeling different hydrological pr ◦ Cesses, and therefore, its accurate prediction is highly essential. The study validates the feasibility of new temperature based heuristic models (i.e., group method of data handling neural network (GMDHNN), multivariate adaptive regression spline (MARS), and M5 model tree (M5Tree)) for estimating monthly ET 0 . The outcomes of the newly developed models are compared with empirical formulations including Hargreaves-Samani (HS), calibrated HS, and Stephens-Stewart (SS) models based on mean absolute error (MAE), root mean square error (RMSE), and Nash-Sutcliffe efﬁciency. Monthly maximum and minimum temperatures ( T max and T min ) observed at two stations in Turkey are utilized as inputs for model development. In the applications, three data division scenarios are utilized and the effect of periodicity component (PC) on models’ accuracies are also examined. By importing PC into the model inputs, the RMSE accuracy of GMDHNN, MARS, and M5Tree models increased by 1.4%, 8%, and 6% in one station, respectively. The GMDHNN model with periodic input provides a superior performance to the other alternatives in both stations. The recommended model reduced the average error of MARS, M5Tree, HS, CHS, and SS models with respect to RMSE by 3.7–6.4%, 10.7–3.9%, 76–75%, 10–35%, and 0.8–17% in estimating monthly ET 0 , respectively. The HS model provides the worst accuracy while the calibrated version signiﬁcantly improves its accuracy. The GMDHNN, MARS, M5Tree, SS, and CHS models are also compared in estimating monthly mean ET 0 . The GMDHNN generally gave the best accuracy while the CHS provides considerably over/under-estimations. The study indicated that the only one data splitting scenario may mislead the modeler and for better validation of the heuristic methods, more data splitting scenarios should be applied.


Introduction
Reference evapotranspiration (ET 0 ) is one of the major components in the hydrological cycle [1]. It contributes to a rationale water resources management [2,3], and it is important in agriculture for measuring crop water requirement quantification [4]. In addition, ET 0 is used as inputs for several hydrological models, and adopted for climate change studies [5]. Several empirical and semi-empirical methods have been developed at different time scales for ET 0 prediction. The performance of the methods varies with respect to the meteorological variables included in the methods, ranging from temperature based, radiation based, and combination methods [6]. One of the earliest methodologies, the ET0 calculated using the standards FAO56 Penman-Monteith, has been adopted as a reference approach [7]. However, ET0 can be measured directly using lysimeters [2]. Measurement of ET0 is varied from one region to another and that is totally based on the regional climate characteristics [8]. Hence, empirical formulation is demonstrated as a remarkable limitation on the ET0 estimation. During the last few decades, models based on computer aid capacity indicated a distinguished progress in the hydrology and water resources fields [9][10][11][12][13]. Artificial intelligence (AI) models have been extensively applied as a reliable soft computing technology for ET0 estimation based on the available and measured climatic variables [14][15][16].
A review of the literature indicates that numerous studies have examined the application of several AI models for estimating ET 0 [17][18][19][20][21][22][23][24][25][26][27][28]. In more detailed state-of-the-art, Yin et al. [26] introduced a new hybrid AI model dependent on the hybridization of a genetic algorithm with a kernel model i.e., a support vector machine (GA-SVM) for modeling daily ET 0 in China using several daily climatic variables including T max , T min , wind speed (U 2 ), relative humidity (RH), and solar radiation (SR). Compared to classical an artificial neural network (ANN) and the SVM models, the scholars demonstrated the superiority of the GA-SVM in predicting ET 0 . Jovic et al. [17] proposed a hybrid method called genetic programming (GP) for estimating ET 0 using eight climatic variables. Mattar [20] applied gene expression programming (GEP) for modeling monthly ET 0 in Egypt, using five input variables T max , T min , RH, U 2 and Rs. Tao et al. [25] introduced a hybrid method called adaptive neuro-fuzzy inference systems (ANFIS) with a firefly algorithm (FA) (ANFIS-FA) for modeling daily ET 0 at Burkina Faso. Using six input variables namely, T max , T min , maximum relative humidity (RH max ), Rs, U 2 , and vapor pressure deficit (VP), the authors demonstrated that the FA significantly increased the exactness of the ANFIS method, and that the hybrid ANFIS-FA provided high accuracy with a determination coefficient (R 2 ) nearly equal to 0.97 compared to a R 2 of 0.91 obtained using standard ANFIS. Using data from India, Adamala [27] compared four models in predicting daily ET 0 , using fewer inputs variables: T max , T min , and extra-terrestrial radiation (Ra). The applied models were a wavelet neural network (W-ANN), ANN, multi linear regression (MLR), and wavelet linear regression (W-MLR). The authors reported that decomposition of the input variables applying wavelet transform significantly improved the performance of the models with Nash-Sutcliffe efficiency (NSE) equal to 0.82. Using T max , T min , RH, U 2 , Rs, and sunshine hours (SH), Gavili et al. [28] demonstrated that ANN model was better than GEP and ANFIS for predicting monthly ET 0 in Iran, with a NSE value that reached 0.98 during the testing phase for all tested stations. Khoshravesh et al. [19] compared three regression methods, namely multivariate fractional polynomial (MFP), Bayesian (BR), and robust regressions (RBR) for modeling monthly ET 0 in Iran, using T max , T min , mean temperature (T mean ), and Rs. Karbasi [29] built a new Gaussian process regression (GPR) for forecasting daily ET 0 several days in advance and demonstrated that the use of wavelet decomposition significantly increased the abilities of the methods and the RMSE dropped from 0.816 mm to 0.068 mm.
Among several machine learning models explored over the literature, the group method of data handling type neural network (GMDHNN) is an dependent on the Rosenblatt's perceptron method introduced by Farlow [30]. GMDHNN is successfully applied in diverse engineering applications [31][32][33][34]. Within hydrology and water resources related research, Najafzadeh et al. [35] developed the GMDHNN model for scour depth (SD) of pipelines estimation due to waves variability; the prediction of local SD at bridge abutments in coarse sediments with thinly armored beds was conducted by Najafzadeh et al. [36]; simulation of flow discharge of straight compound channels was reported by Najafzadeh and Zahiri [37]; prediction of significant wave height was established by Shahabi et al. [38]; prediction of turbidity considering daily rainfall and discharge data was determined by Tsai and Yen [39]; an improved modeling of the discharge coefficient for triangular labyrinth lateral weirs was described by Parsaie and Haghiabi [40]; an evaluation of treated water quality in a water treatment plant was carried out by Alitaleshi and Daghbandan [41]; a prediction of turbidity and the free residual aluminum of drinking water was tested by Daghbandan et al. [42]. Based on the reported literature review, only one study reported the implementation of the GMDHNN ET 0 modeling developed by da Silva Carvalho and Delgado [43]. The study conducted based on the calculated FAO56 Penman-Monteith using only previous values and very limited data of daily scale over three years (January 2011 to January 2014), were utilized for the modeling development.
Multivariate adaptive regression splines (MARS) model introduced by Friedman [44], and M5 tree (M5Tree) model established developed by Quinlan [45]. They are another distinguished category of a data driven model, which is mainly used in environmental, hydrology, irrigation, and hydraulic studies. The MARS model is one of the sophisticated AI models, as it has the ability to provide a non-parametric feature that is able to identify the actual relationship between predictors and predicted using splines method for detecting the nonlinearity pattern [46]. The MARS model has been successfully applied in many hydrological applications [47][48][49][50][51][52]. The MARS model was successfully used to predict water pollution by Kisi and Parmar [47], to forecast sediment load by Adnan et al. [48], to model daily streamflow by Yin et al. [49], to predict evaporation by Ghaemi et al., [50], and to predict monthly river flow by Adnan et al. [51].
However, fewer applications related to the ET 0 modeling can be seen in the related literature. For instance, Mehdizadeh et al. [53] compared MARS, SVM, and GEP for modeling monthly ET 0 in Iran, using several climatic variables as inputs: T max , T min , T mean , RH, U 2 , VP, Ra, Rs, and Rn. The authors have compared several scenarios, namely temperature-based, radiation-based, mass transfer-based, and meteorological parametersbased scenarios. For the temperature-based scenarios, using only T max , T min , and Ra, the research finding approved the potential of MARS model over the SVM and GEP with a R 2 equal to 0.944 in the validation phase. Mehdizadeh et al. [22] investigated the capacity of MARS and GEP models for estimating daily ET 0 in Iran using four climatic variables: T mean , RH, U 2 , and Rs. The author demonstrated that the MARS model performed the best using all climate variables with a R 2 nearly equal to 0.99 in the validation phase. Using four climatic variables, namely, T mean , RH, U 2 , and Rs, Kisi [54] compared MARS, M5Tree and least square support vector machines (LSSVM) for modeling monthly ET 0 in Turkey. The authors demonstrated that in some cases, MARS is superior over the two others in terms of performance accuracy. Keshtegar et al., [55] and Keshtegar and Kisi,[56] applied the M5Tree, ANFIS, and ANN for modeling daily ET 0 in Turkey, using T mean , RH, U 2 , and Rs as input variables. Kisi and Kilic [57] compared M5Tree and ANN for modeling daily ET 0 in USA, using T mean , RH, U 2 , and Rs. Rahimikhoob [58] compared M5Tree and ANN for modeling monthly ET 0 in USA, using T mean , RH, U 2 , and Ra, in Iran. The authors reported that both methods produced almost similar estimates with smaller differences.
The majority of the aforementioned studies have been conducted using several input variables. With the exception of the investigation by Mehdizadeh et al. [53], in which the MARS model was applied for modeling ET 0 using only temperature data as inputs, there are limited studies that have applied MARS and M5Tree models for ET 0 utilizing only temperature inputs. Hence, the major objective of the present investigation was to assess the performances of GMDHNN, MARS, and M5Tree models to estimate ET 0 using only temperature and extra-terrestrial radiation and validating the results against the empirical formulations (i.e., HS, CHS and SS). The main motivation of the current study is using specific climate data (temperature) to simulate the ET 0 , as this involves highly essential and significant factors influencing ET 0 , while recording of such data for long durations is an easy task in developing countries. The other difference of this study compared to previous ones is the use of different data splitting scenarios and the inclusion of periodicity (month number of the year) as an input to the GMDHNN, MARS, and M5Tree models.

Case Study
In the present study, monthly maximum and minimum temperatures (T max and T min ), solar radiation, relative humidity, and wind speed measured at Adana (latitude 37 • 00 N, longitude 35 • Table 1. R a has the highest correlation with ET 0 followed by the T min and T max and R a has a higher correlation with ET 0 in Antakya compared to Adana. It is also visible from Table 1 that ET 0 in Adana is higher than for Antakya. an easy task in developing countries. The other difference of this study compared to previous ones is the use of different data splitting scenarios and the inclusion of periodicity (month number of the year) as an input to the GMDHNN, MARS, and M5Tree models.

Case Study
In the present study, monthly maximum and minimum temperatures (Tmax and Tmin), solar radiation, relative humidity, and wind speed measured at Adana (latitude 37°00′ N, longitude 35°19′ E, altitude 27 m) and Antakya (latitude 36°33′ N, longitude 36°30′ E, altitude 100 m) stations in the Mediterranean Region of Turkey were utilized. The stations operated by the Turkish Meteorological Organization can be observed from Figure 1. The data periods used for the Adana and Antakya are 1968-2015 and 1983-2010, respectively. The statistical parameters of the data employed in the applications are summed up in Table 1. Ra has the highest correlation with ET0 followed by the Tmin and Tmax and Ra has a higher correlation with ET0 in Antakya compared to Adana. It is also visible from Table 1 that ET0 in Adana is higher than for Antakya.    T min , T max , R a, and ET 0 are minimum and maximum temperatures, extraterrestrial radiation, and reference evapotranspiration, respectively. x min , x max , x mean , S x, and C sx are minimum, maximum, mean, standard deviation, and skewness, respectively.

Group Method of Data Handling Type Neural Network
GMDHNN is a powerful machine learning tool based upon the principle of termination. In this principle, the system follows the one process through data importing, rearing, hybridizing, choice, and rejection. The GMDH algorithm is divided into two parts: one is the parameter and other is the non-parameter. If the variance is low, then parametric algorithms provide the best results and for high variance, non-parametric algorithms perform better. The GMDHNN model is capable of handling the multiple input variables and Sustainability 2021, 13, 297 5 of 21 provides single output. There are different layers in the GMDHNN model, which have a set of neurons; these neurons are further linked with quadratic polynomial in every layer, which provides the new neurons for the next layer [59][60][61]. The output of the database with multiple input variables and M numbers of observations is defined as below.
Here, (y 1 , y 2 , y 3 , y 4 , . . . , y M ) is the real input of the mapping f and m i is the real output based upon the real input. For the identification of the problem,f is considered as mapping instead of f to forecast them (output value) instead of m, and thism is close to m. The provided input (y 1 , y 2 , y 3 , y 4 , . . . , y M ) is used for the GMDHNN training to attain the outputm i as given below.
The GMDH-NN algorithm is working to minimize the MSE (mean square error) for making the model most effective for prediction. This MSE is calculated as E below to make the error level reach its minimum.
As we discussed above, neurons are connected with the quadratic polynomial, while Kolmogorov-Gabor polynomial is used to conduct relation mapping between input and output variables [53,55,56]. This Kolmogorov-Gabor polynomial can be expressed as below To minimize the variation between the actual output (m) and estimated output (m), a regression model is applied for each pair of input variables y i , y j .

Multivariate Adaptive Regression Splines
The Multivariate adaptive regression splines model (MARS) was proposed by Friedman [44] as a new data driven technique, looking for any possible nonlinear and nonparametric relationship which can exist and can be built between a set of inputs and output variables. MARS is used to try to identify and automatically establish the possible explicit regression equation between the regressors and the dependent variables in a stepwise manner; another important advantage of the MARS model is its abilities to provide the part of the contribution of each predictor to the dependent variable, and at the end of the training procedure it provides the final rankings of the regressors individually based on its rank [44]. A wide range of applications of the MARS model can be found in the literature including: estimating heating load in buildings [62,63], predicting centerline segregation in steel cast products [64], predictions of landslide susceptibility [65], estimating fractional snow cover (FSC) from MODIS data [66], and predicting monthly discharge and mean soil temperature [22,62]. Using the MARS model, the space of regressors is divided into several subspaces called knots, each has its own function and splines (segments) which are used to link all these knots, and all the spline are grouped to form a basis function (BF). Hence, globally speaking, the MARS model is based on three major clear and precise components: knots, spline, and BF, and the development of the model is achieved in two phases: forward (building) and backward (pruning) phases. In the forward phase, a high dimensional model is built that contains the chosen knots and their corresponding BF. During the backward phase, the BF that provides fewer contributions to the decreasing of the error is pruned via generalized cross-validation (GCV) [66].
Firstly, MARS starts by building a set of BF with the following equation [44]: where x is one of the regressor variables, c is the threshold value for the regressor x, and the BF is the basis function. Consequently, the MARS model is developed as an ensemble of the BF as follow: Y is the response (dependent) variable (ET 0 ), BF is the basis function, x is a regressor that contributing to the formation of the BF, and ψ m are unknown coefficients of the mth BF, while M is the total number of the BF [44,66]. The GCV is expressed as follow: (7) N is the quantity of the pattern, M presents the BF number, y i is the targeted variable (ET 0 ), f (x i ) is the predicted value of the pattern i, and c(M) is the penalty factor [67]. The MARS model was implemented utilizing the MatLab toolbox ARESLab [68].

M5 Model Tree
The M5 model tree (M5Tree), which is an amended version of the original decision tree (DT), was proposed by Quinlan [45]. DT was originally proposed for solving classification problems using a splitting method, for which the available information from the data is extracted via the construction of a tree composed of three kinds of nodes: the internal, the roots, and the leaves nodes [45]. The M5Tree has been used for solving several problems, such as predictions of energy consumption in buildings [69], air quality modeling [70], predicting liquefaction-induced lateral spreading [71], forecasting solar ultraviolet [72], and predicting daily water levels in rivers [73]. The M5Tree is a regression model in which the training data are being apportioned to smaller subsets through the construction of a tree and using a gain ratio criterion, an individual regression model is built for each subset [45]. Once the tree had been constructed, the training process starts and tries to determine the best separation to different subsets with respect to two conditions: (i) the leaves nodes of the tree only contains patterns from one subset or (ii) separation does not occur until any improvement in the gain ratio is observed. For a given n number of nodes leaves corresponding to k breaking points, each subset has its own linear model as follow [73]: where Y is the calculated ET 0 , x is one of the input variables selected for model development (climatic variables), λ 01 and λ 1i (i = 1:n) are the parameters of the linear models at n leave, and Z 1:k are the breaking points values. According to Quinlan [45], building an M5Tree model generally takes two major steps: the growth step (create a DT) and the tree pruning step to prune back an overgrown tree. The standard deviation reduction (SDR) statistical metric was used to compute the error at each node as the splitting criterion [74,75]: T i indicates the subset of the ith possible test, T represents the examples number reaching the node, and sd is the standard deviation of the observations. The M5Tree is applied utilizing the MATLAB toolbox M5PrimeLab [76].

Stephens-Stewart Model
Stephens and Stewart's [77] method is used for pan evaporation estimation. It can be expressed as E pan = R(a + bT mean ) where E pan is daily pan evaporation (mm/month), R is solar radiation (mm/month) at daily scale, and a and b refer the fitted parameters. In the present study, the SS method given in Equation (9) was used for ET 0 estimation by using extraterrestrial radiation instead of solar radiation data: ET 0 = Ra(a + bT mean ) (11) where ET 0 denotes the reference evapotranspiration (mm/month) and Ra refers the extraterrestrial radiation (mm/month).

Hargreaves and Samani Model
Hargreaves and Samani (HS) [78,79] is a temperature-based model and need only fewer input variables: extraterrestrial radiation (Ra) (mm/day), T max and T min ( • C): where T max and T min are the monthly maximum and minimum temperatures ( • C), respectively. The calibrated version of HS given in the following equation was also employed in this study: ET 0, calibrated = a + bET 0 (13) where a and b are fitted parameters.

Model Development by Heuristic Methods
In the presented study, three abovementioned heuristic methods were implemented for monthly ET 0 estimation. Three different data division scenarios: 50-50%, 60-40%, and 75-25%, were employed in the applications so as to see the effect of training/test size on models' accuracy. It is well-known that data-driven methods are highly affected by the size of the training data and that more data generally produce a better model. As also mentioned in the introduction section, the studies in the existing literature generally utilize four climatic inputs: air temperatures (T max , T min ), wind speed (U 2 ), relative humidity (RH), and solar radiation (SR) in ET 0 estimation. In developing countries, measurement of all these variables is always not possible and therefore models requiring a limited number of inputs are necessary in such cases. As was also reported by a recent review [1], future studies are required for developing new models with limited inputs. Keeping this in the mind, the following two input combinations were considered in this study: T min , T max , R a T min , T max , R a , α.
It is worth mentioning that the air temperature is easily available in every place and that R a can be calculated using the Julia date. The developed models are be useful in practical applications because they only need a smaller number of input variables. The periodicity (α, month number of the year) was also considered in the model input to see its influence on models' exactness if there is any. The flowchart provided in Figure 2 summarizes the model development procedure.
In the applications, GMDHNN, MARS, and M5Tree heuristic methods were employed to estimate monthly ET 0 while only utilizing temperature data as model inputs. Data of two stations, Adana and Antakya, were used for calibration of the methods. First, monthly ET 0 values were calculated by the FAO-56 PM formula using data of minimum and maximum temperatures, relative humidity, solar radiation, and wind speed following the guideline of Allen et al. [7]. Then, the obtained ET 0 data were used for the calibration and test of the selected models. The outcomes of GMDHNN, MARS, and M5Tree methods were compared with the empirical HS, calibrated HS (CHS), and SS regression methods.  Table 2 while considering RMSE, MAE, and NSE statistics. In the table, training and testing accuracies can be observed for three different train-test scenarios. In the three implemented heuristic methods, default structures were used and models were calibrated by introducing the training data; in case of the 1st, 2nd, and 3rd scenarios, 50%, 60%, and 75% of the whole data were utilized to obtain optimal parameters of the models. After calibration process, the calculated parameters of the GMDHNN, MARS, and M5Tree models were kept and they were directly used in the testing stage and models were validated by the test data; in case of the 1st, 2nd, and 3rd scenarios, 50%, 40%, and 25% of the whole data were utilized to assess the models' accuracies based on the three aforementioned statistics (RMSE, MAE, and NSE). GMDHNN2, MARS2, and M5Tree2 models were also developed by adding periodicity component (α) to the GMDHNN1, MARS1, and M5tree1 models so as to see its effect on models' accuracy in estimation ET0. It is obvious that there was not any considerable effect of α on models' exactness in this station. M5Tree models had better fitting in the training stage whereas the GMDHNN and MARS models has a superior performance to the M5Tree in the testing stage. GMDHNN has a better

Application and Results
The models were evaluated with respect to three commonly used statistics: root mean square error (RMSE), mean absolute error (MAE), and Nash-Sutcliffe efficiency (NSE) [80][81][82]. RMSE and MAE varied from 0 to positive infinity. RMSE and MAE outcomes equivalent to 0 indicate a perfect fit. NSE varies from negative infinity to 1 and 1 means that models perfectly catch the observed values. The expressions of the RMSE, MAE, and NSE are: In the equations, N is the quantity of data, ET 0 is average value of the reference evapotranspiration computed by FAO-56 PM, ET 0,iM is estimated ET 0 , and ET 0,i is computed reference evapotranspiration.
For Adana Station, GMDHNN, MARS, M5Tree, HS, CHS, and SS models are compared in Table 2 while considering RMSE, MAE, and NSE statistics. In the table, training and testing accuracies can be observed for three different train-test scenarios. In the three implemented heuristic methods, default structures were used and models were calibrated by introducing the training data; in case of the 1st, 2nd, and 3rd scenarios, 50%, 60%, and 75% of the whole data were utilized to obtain optimal parameters of the models. After calibration process, the calculated parameters of the GMDHNN, MARS, and M5Tree models were kept and they were directly used in the testing stage and models were validated by the test data; in case of the 1st, 2nd, and 3rd scenarios, 50%, 40%, and 25% of the whole data were utilized to assess the models' accuracies based on the three aforementioned statistics (RMSE, MAE, and NSE). GMDHNN2, MARS2, and M5Tree2 models were also developed by adding periodicity component (α) to the GMDHNN1, MARS1, and M5tree1 models so as to see its effect on models' accuracy in estimation ET 0 . It is obvious that there was not any considerable effect of α on models' exactness in this station. M5Tree models had better fitting in the training stage whereas the GMDHNN and MARS models has a superior performance to the M5Tree in the testing stage. GMDHNN has a better accuracy than the MARS but the difference is not too large. The calibration process considerably increases the HS performance in the estimation of ET 0 . Average statistics in Table 2 show that the GMDHNN and SS have almost the same performance and they show a superior performance to the other models with respect to three criteria. The relative differences between the GMDHNN/SS and MARS2 models with respect to average RMSE and MAE are 2.9% and 3%, respectively. Detailed results indicate that a slight difference exists between periodic MARS (MARS2) and SS models in 50-50% and 60-40% train-test scenarios, while the latter performs better than the first in a 75-25% scenario. These results tell us that the use of one data-splitting scenario may mislead modelers during evaluation of the methods' accuracy. The methods are also compared in Figure 3 with respect to RMSE and NSE in the testing stage. The variation of the criteria (RMSE, NSE) with respect to different splitting scenarios is parallel to each other for all of the applied methods. NSE decreases and RMSE slightly increases from the first splitting scenario (50-50%) to the third scenario (75-25%).    Table 3 reports the training and testing statistics of the employed methods for Antakya Station. Unlike the Adana Station, including the periodicity input considerably improves the accuracy of MARS and M5Tree methods in the testing stage. Similar to for the previous station, here temperature based GMDHNN models also show superior performance to the MARS and M5Tree models in the estimation of ET 0 . All heuristic methods outperform the SS method. A considerable improvement is observed for the HS method after calibration: RMSE and MAE are increased from 1.715 mm and 1.557 mm to 0.655 mm and 0.541 mm with respect to average statistics, respectively. The SS model has better accuracy than the HS and CHS models in the estimation of ET 0 using only temperature data as inputs. The relative differences between GMDHNN2 and MARS2/M5Tree/SS models with respect to average RMSE and MAE are 3.7%/10.7%/0.8% and 4%/11.7%/1.1%, respectively. The results of the Antakya Station suggest the use of a periodicity input in model development. The RMSE and NSE values of the applied methods are also compared in Figure 4 for the testing stage. Here the criteria also vary in similar ways for all the methods except for the CHS. Unlike Adana Station, the NSE slightly increases and RMSE decreases from the 50-50% splitting scenario to a 75-25% scenario. Comparison of the two stations (compare Figures 3 and 4 or Tables 2 and 3) reveals that the models generally provide better estimates for Antakya Station compared to Adana. A higher correlation between the inputs (T min , T max , R a ) and output (ET 0 ) in Antakya compared to Adana may be the reason for this.     Figure 5 illustrates the FAO-56 PM and estimated ET 0 obtained by using six different methods for Adana Station. It is apparent from the scatterplots that the HS considerably overestimates ET 0 while the CHS has less scattered estimates compared to HS. GMDHNN and SS methods have the least scattered estimates among the applied methods and are closely followed by the MARS method. The methods are graphically compared in Figure 6 in estimation of ET 0 of Antakya Station. Here the CHS also considerably improves the HS accuracy. GMDHNN also has the least scattered estimates followed by the MARS in this station.  The monthly mean estimates of the GMDHNN, MARS, M5Tree, HS, CHS, and SS methods are compared in Figure 7. In Adana Station, the GMDHNN2, MARS2, M5Tree1, and SS model results are generally very close to each other while the CHS underestimates the ET 0 of March and May and overestimates those of July and August. The models' estimates do not considerably change with respect to splitting scenarios. In Antakya Station, however, the models' accuracy changes for different train-test scenarios. For example, the GMDHNN2, MARS2, M5Tree1, and SS models are less successful in estimation of ET 0 in the case of the 50-50% splitting scenario while the 75-25% train-test scenario provides the best estimates. This also confirms the necessity of considering different splitting scenarios in evaluating the accuracy of the applied methods in the estimation of ET 0 . It is apparent that the CHS has the worst estimates while the GMDHNN2 maps the mean monthly ET 0 better than the other models. All the models underestimate ET 0 of Antakya Station in the 50-50% splitting scenario.

Conclusions
The ability of new temperature based regression methods were compared with Hargreaves-Samani, calibrated Hargreaves-Samani, and Stephens-Stewart methods in modeling monthly reference evapotranspiration. The applied models only used maximum and minimum temperatures and extraterrestrial radiation inputs that were measured and calculated for two stations in Turkey. Data division scenarios of 50%-50%, 60%-40%, and 75%-25% were applied in the study to evaluate the aforementioned methods. The periodicity component (the month number of the year varying from 1 January to 12 December)

Conclusions
The ability of new temperature based regression methods were compared with Hargreaves-Samani, calibrated Hargreaves-Samani, and Stephens-Stewart methods in modeling monthly reference evapotranspiration. The applied models only used maximum and minimum temperatures and extraterrestrial radiation inputs that were measured and calculated for two stations in Turkey. Data division scenarios of 50-50%, 60-40%, and 75-25% were applied in the study to evaluate the aforementioned methods. The periodicity component (the month number of the year varying from 1 January to 12 December) was also used as an input to the models so as to examine its effect on models' performances.
Three commonly used criteria: RMSE, MAE, and NSE, were used for comparison of the methods. The results indicated the following conclusions: In Adana Station, GMDHNN and the SS model performed better than the other models. In Antakya Station, however, the GMDHNN model provided the best accuracy followed by the MARS and M5Tree in modeling monthly reference evapotranspiration.
The calibration procedure considerably increased HS model accuracy. For example, average RMSE and MAE statistics of HS were increased from 1.715 mm and 1.557 mm to 0.655 mm and 0.541 mm for Antakya Station.
The periodicity component increased the accuracy of GMDHNN, MARS, and M5Tree models in Antakya Station only. RMSE decrements of the GMDHNN, MARS, and M5Tree models in the test stage were 1.4%, 8%, and 6%, respectively.
The applications revealed the necessity of using different data division scenarios for better evaluation of the compared models.
Comparison of the models in estimating monthly mean reference evapotranspiration revealed that the GMDHNN model generally had better accuracy compared to other models while the CHS models provided the worst estimates. By implementing the GMDHNN model, the average RMSE of MARS, M5Tree, HS, CHS, and SS models respectively decreased by 3.7-6.4%, 10.7-3.9%, 76-75%, 10-35%, and 0.8-17% when estimating monthly ET 0 .
The results of this study recommend the use of the GMDHNN model for the prediction of ET 0 in regions where only the temperature is available while other meteorological data are not available or are missing for a long duration.