Reference Evapotranspiration Modeling Using New Heuristic Methods

The study investigates the potential of two new machine learning methods, least-square support vector regression with a gravitational search algorithm (LSSVR-GSA) and the dynamic evolving neural-fuzzy inference system (DENFIS), for modeling reference evapotranspiration (ETo) using limited data. The results of the new methods are compared with the M5 model tree (M5RT) approach. Previous values of temperature data and extraterrestrial radiation information obtained from three stations, in China, are used as inputs to the models. The estimation exactness of the models is measured by three statistics: root mean square error, mean absolute error, and determination coefficient. According to the results, the temperature or extraterrestrial radiation-based LSSVR-GSA models perform superiorly to the DENFIS and M5RT models in terms of estimating monthly ETo. However, in some cases, a slight difference was found between the LSSVR-GSA and DENFIS methods. The results indicate that better prediction accuracy may be obtained using only extraterrestrial radiation information for all three methods. The prediction accuracy of the models is not generally improved by including periodicity information in the inputs. Using optimum air temperature and extraterrestrial radiation inputs together generally does not increase the accuracy of the applied methods in the estimation of monthly ETo.


Introduction
For irrigation management, agricultural processes, and hydrological cycles, the reference evapotranspiration (ETo) is considered as one of the fundamental variables that should be accurately enhancement for the ANFIS model has been developed, which is defined as a dynamic evolving neural-fuzzy inference system (DENFIS) in order to be more suitable for dynamic time series prediction applications [27,28]. Conceptually, the DENFIS is designed to generate and update new fuzzy rules during the learning of the system. Such a procedure allows the DENFIS to calculate the desired output according to them-most activated rules, which have been chosen dynamically from the set of the fuzzy rules [27][28][29]. It could be noticed from the DENFIS structure and procedure and the nature of ETo that the DENFIS model has the potential to adequately and effectively estimate the ETo. To the knowledge of the authors, there is no study in the literature related to the application of DENFIS in modeling ETo.
More recently, LSSVR has been known as a useful tool for several engineering prediction applications. LSSVR has a simple and effective structure; however, it includes three unknown parameters that should be estimated and initialized at the beginning, which is considered one of the major disadvantages of the LSSVR. Kisi [19] showed that the more accurate the estimation of the LSSVR's unknown parameters, the more accurate the estimation of the desired variable. However, the most common method to estimate these three unknown parameters is based on trial and error, as in many previous studies. One innovation for the development of a successful LSSVR model is to be integrated with a suitable optimization algorithm that could be able to search for the optimal estimation values for these three unknown parameters [19]. It has been reported that the gravitational search algorithm (GSA) is a useful optimization algorithm which is comparatively fast in convergence and free from trapping in local minima over the other optimization algorithms [30]. Therefore, integrating the LSSVR with (GSA) as an efficient optimizer for searching for the optimal values of the LSSVR's unknown parameters could be a suitable solution to enhance the LSSVR's ability to estimate the desired ETo. Available literature indicates that new methods are essential to enhancing ETo estimation preciseness and decreasing the model's uncertainty. For this reason, novel hybrid heuristic soft computing methods were examined in this study for performing effective evapotranspiration modeling. A number of studies [31][32][33][34][35] have compared one or more of the hybrid heuristic models using several meteorological variables at different time steps, and the results obtained have shown that hybrid heuristic soft computing methods generally provided good accuracy. However, even though the aforementioned studies demonstrated successful applications of hybrid heuristic soft computing models, the most important point to note is that, generally speaking, these models do not use fewer independent input variables; instead, they require several inputs-i.e., U, RH, SH, Tmean, Tmax, and Tmin-and among them the U, RH, SH, and SR are the meteorological variables reported as the most significant factors influencing ETo and of primary importance. To the author's knowledge, there is not any published study that investigates the potential of the LSSVR-GSA method in modeling ETo. First, we developed the hybrid of GSA and LSSVR for modeling ETo. Second, we used the already applied standalone soft computing method (M5RT) with only average temperature input, whereas most of the other studies in the literature used a mix of the Tmax and Tmin and several other meteorological variables during applications of such models. Third, we used extraterrestrial radiation which is easily obtained from the Julian date and latitude information as input to the models. Fourth, we demonstrated the effect of periodicity (month number) as an input variable for ETo estimation using limited climatic inputs.
In the current study, an attempt to develop a model to estimate the ETo based on a minimal number of input variables has been proposed utilizing advanced machine learning methods. LSSVR-GSA, DENFIS, and M5RT models have been developed to estimate the ETo considering only the temperature as an input variable. Data from three stations in the Jinsha river basin of China have been used in order to evaluate and examine all the proposed models. Additionally, in-depth, comprehensive analyses for the performance of all models have been accomplished in order to elucidate the logic and the cause behind each model's estimation accuracy.

Case Study
The study uses monthly average temperatures (T), extraterrestrial radiation (Ra), and evapotranspiration (ETo) data of three meteorological stations, represented as 56004 (latitude 34.21 N, longitude 92.43 E), 56021 (latitude 34.13 N, longitude 95.78 E), and 56029 (latitude 33.01 N, longitude 97.01 E,) stations situated in the Jinsha river basin of China. The location of each station can be seen in Figure 1. These stations are operated by the China Meteorological Administration (CMA). Jinsha river basin consists of three provinces (Qinghai, Yunan, Sichuan, and Tibet Autonomous) covering the 473.2 × 10 3 km 2 drainage area of China. The Jinsha river basin has a very vital role in regional and national economic development due to its contribution to irrigation, water supply, flood control, wood drift, tourism, and plentiful hydropower resources (58,060 MW). The mean annual rainfall in the Jinsha river basin is 750 mm, 90% of which occurs from May to October. The basin selected area belongs to humid warm temperate, which is the primary source of water, having annual mean rainfall of about 1200 mm with mean annual potential evapotranspiration of about 1350 mm. In the study, the data periods of three meteorological stations from 1961 to 2012 were used. Available data were divided into three parts as training (50% of the whole data covering 1961-1986), validation (25% of the whole data covering 1986-1999), and testing (25% remaining part covering 2000-2012). The brief statistical characteristics of the user data are summed up in Table S1. Evapotranspiration data of three meteorological stations are calculated using the FAO-56 PM method. The detailed procedure of the FAO-56 PM method is described by Allen et al. [36] and She et al. [37].

Dynamic Evolving Neural-Fuzzy Inference System (DENFIS)
DENFIS is a neural-fuzzy modeling approach that evolves through gradual hybrid learning procedures resulting from the tuning of local elements regarding the new data entry to the model-specific categorization [27]. DENFIS is an extension of the evolving fuzzy neural network (EFuNN), which uses a clustering approach called the evolving clustering method (ECM) [38]. Dynamic features of the EFuNN make DENFIS useful for online adaptive systems also. ECM itself is an online fast-evolving clustering method that considers the maximum distance between points and the cluster's center.
ECM is based on triangular membership functions, which serve as fuzzy sets and the utilization of an alternative weighting scheme for local learning of resulting parameters. Usually, the input and output neurons can be fuzzified by a fuzzy quantization approach [39]. The offspring fuzzy rules are consequently generated and updated while the system is in operation. The outputs resulting from DENFIS are computed through FIS-based activated fuzzy rules, which are dynamically chosen from a given fuzzy set of rules. Notably, the ECM used in the DENFIS interface utilizes a scatter partitioning of the input space to create fuzzy inference rules [38]. Uses of ECM improve the computational performance for the dynamic estimation of the cluster's quantity in the given dataset by facilitating the finding of their current centers in the input space. A typical structure and detailed description of the EFuNN interface based on ECM from which the DENFIS approach is derived can be found at [27]. DENFIS uses a typical model architecture, which is based on the Takagi-Sugeno type fuzzy inference engine [34]. In this regard, DENFIS utilizes a model-based approach called the "lazy" learning approach. According to this approach, the fuzzy network estimates the position of each input vector in the attribute space and forms accordingly; meanwhile, a fuzzy inference system which predicts the output through a dynamic process is created during the incremental learning.
The classical "lazy" learning process of the fuzzy network utilizes a sample-based approach, wherein a small local model on demand is constructed regarding local samples taken from the closest query point. The learning process and governing equations used in the DENFIS model and applied in this study are described in detail in [39]. DENFIS models are being applied in different water resource-related areas and other disciplines as well. The use of DENFIS in rainfall-runoff modeling showed that results attained from the local learning model were significantly better than results achieved from physically-based models [38]. In another study related to evaporation modeling based on limited meteorological data, it was found that DENFIS models increased the accuracy of the outputs significantly [39]. Similar results were obtained in other studies as well [40]. DENFIS models were recently applied to predict solar radiation [41]; the results showed that DENFIS provides faster and much more accurate results compared to the other neuro-fuzzy models.

Least Squares Support Vector Regression (LSSVR)
Least squares support vector regression was initially proposed by [42]; it is a version of the support vector regression (SVR) algorithm improved by adopting the loss function differently from SVR and minimizing the square error. In the LSSVR interface model, the inequality constraints have been replaced with equality constraints by transforming quadratic programming issues into a linear equation to overcome the calculation issue of the large-scale dataset [43]. SVR itself is well known, and a robust algorithm applied in different areas mainly for data regression and also to overcome the over-fitting issue. LSSVR differs from SVR, mainly due to the types of functions used by each of them. LSSVR handles square errors instead of non-negative errors, and it uses equality constraints instead of inequality constraints, as opposed to the conventional SVR [44].
Similarly, to the DENFIS, LSSVR performs the training procedure in terms of solving a linear system instead of using a quadratic programming problem, which thus significantly improves the computation time and the accuracy of the model learning ability [45]. Figure 2 illustrates the structure of a typical LSSVR model. A detailed explanation about LSSVR applications, parameters, and respective governing equations is given by Yuan et.al [44]. LSSVR is widely applied in solving challenging problems in different areas. Wu and Peng [43] used LSSVR to improve the prediction performance of the potential wind power; they found that LSSVR can enhance the accuracy of up to 20% compared to other single or hybrid models. A similar conclusion was obtained by Lu et al. [46] who applied LSSVR for short-term forecasting of wind power. The application of the LSSVR has shown promising results in the prediction of nanoparticles as well [47].

Gravitational Search Algorithm (GSA)
GSA, which is based on the law of gravity and motion proposed initially by Rashedi et al. [30], represents one of the most effective optimization algorithms compared to other evolutionary algorithms. In GSA, each nod is characterized by four parameters: position, inertial mass, gravitational mass, and velocity. The location of each node corresponds to a solution of the problem, while the gravitational and inertia masses for each node were obtained utilizing a fitness function [48]. The location of the nodes can be expressed as follows: where x k i represents the location of the ith node for the kth dimension. The mass of each node is computed after calculating the fitness of the given population as follows: where fit i (t) and M i (t) represent the fitness value and mass of the ith node at time t, respectively, whereas best(t) and worst(t) are the minimum fitness value and maximum fitness value, respectively, the gravitational acceleration of the node i, is calculated as follows: firstly, the force exerted by a large node on the node i is computed.
where M i (t) and M j (t) are the passive and active gravitational mass, respectively, corresponding to nodes i and j at the t generation; Gc(t) and ε are gravitational and small constants; xk i (t) and xk j (t) indicate the positions of the kth dimensions of nodes i and j at the t generation; and xi(t), xj(t) is the Euclidean distance between nodes i and j. The total gravitational acceleration of the ith node was calculated using the law of motion as follows: where a k i (t) represents the gravitational acceleration of the node i in the kth dimension, and the rand represents a random variable with uniform distribution within the interval (0,1). The total gravitational force exerted on the node i in the kth dimension is calculated as follow: Afterward, the speed and location of each node were updated as follows: As highlighted above also, GSA utilizes the gravitational force as the direct form with which to communicate the cooperation of the nodes. The heavy nodes in GSA are processed, infer reasonable solutions, and move more gradually than lighter ones. Thus, the GSA searches for the ideal solution by appropriately calibrating the inertia and gravitational masses of nodes where every node provides a specific solution.

HLGSA (Hybrid LSSVR-GSA)
The combination of LSSVR with the gravitational search algorithm (GSA) has been reported to improve the forecasting accuracy and computational performance [43] significantly. To our knowledge, there are only a few studies applied to hybrid models, such as combining GSA with other ANN models. Yuan et al. [44] applied LSSVM-GSA to predict day-ahead wind power; they concluded that GSA provided higher accuracy and improved the computational duration. Similar conclusions were reported by Lu at al. [46] and Sun at al. [47] in the case of PM 2.5 concentration prediction. The process of the evapotranspiration prediction model HLGSA using the hybrid of LSSVR and GSA methods consists of the following steps:

1.
Firstly, divide meteorological datasets into training, validation, and testing parts.

2.
Second, select the RBF kernel function and initial parameters for the HLGSA method to build the initial LSSVR model. The initial values of the parameters are set as follows: the range of penalty factor γ is 0.1 to 1500, the range of RBF parameter σ 2 is 0.001 to 10, the iteration range is 20-30, the number of particles can be set up to 30, and constant alpha was found to perform better in range of 12-18, whereas initial gravitational constant G 0 was found to perform better in the range from 102 to 120.

3.
Third, compute the particle fitness value of each node. The RMSE is selected as the fitness function in this study. The fitness function of this method is defined as follows:

4.
Fourth, chose the best parameter combination through GSA to obtain the optimal values for the LSSVR parameters.

5.
Fifth, utilize the new combination of parameters to reconstruct the LSSVR if it does not meet the stopping criterion. 6.
Last, the optimal LSSVR model for forecasting evapotranspiration was built based on the typical parameter values.

M5 Model Tree
M5 model tree (M5RT) is a novel neural approach developed initially by Quinlan [49] to help in solving continuous class learning problems. This model is based on a binary tree that serves as the backbone for the model itself. An M5 model tree consists of a linear regression function applied to terminal leaf nodes to describe the relationships of independent and dependent variables. Since the M5 model tree is a quantitative data focus model, this feature increases the accuracy, and as a result, the importance compared to other models [50]. In general, the principle of the tree-based model consists of "divide-and-conquer" approach for constructing a relationship between independent and dependent variables or input and output parameters. They can also be used for qualitative and quantitative data assessment [51]. All model trees can effectively learn and succeed in tasks with very high dimensionality. The main advantage of the M5 model tree compared to regression trees and other models is that it is much smaller than regression trees [52]. Additionally, the decision strength is evident, and the regression functions do not usually contain abundant variables, which in some cases can reduce the computational performance.
M5 model tree consists of two steps. The first step has to do with the splitting of the datasets into subsets. Splitting process: it often creates an excessive tree-like structure that leads to overfitting. In the second step, the overrun tree is trimmed; then trimmed subtrees are replaced with linear regression functions (Figure 3). M5 model tree has found many applications in water resource-related areas. Rahimikhoob et al. [53] applied the M5 model tree to compute evapotranspiration (ETo) in a semi-arid region. They found that the M5 model tree provides more robust results compared to other conventional or empirical methods. However, for the arid zone, Rahimikhoob [54] found that ANN estimated ET0 better than the M5 model tree. Nevertheless, both models performed in agreement for that study case, and the results were very similar to the results obtained from the FAO56-PM method. Kisi and Kilic [52] estimated ETo considering several empirical and neural-fuzzy approaches, including the M5 model tree. They found that the M5 model tree showed better accuracy, particularly comparing to the empirical models.

Application and Results
The potential of new machine learning methods, LSSVR-GSA and DENFIS, was investigated by applying them to temperature (T) and extraterrestrial radiation (Ra) data of three stations located in China, and the results were compared with the well-known M5RT method. The models' accuracy was evaluated based on three statistical criteria: root mean square error (RMSE), mean absolute error (MAE), and determination coefficient (R 2 ). The RMSE and MAE can be defined as: where N is the number of datapoints used, ET ip is predicted ETo, and ET im is ETo calculated by FAO 56 PM. Table S2 sums up the validation and test results of the and M5RT models concerning different input combinations comprising previous values of temperature (T) and extraterrestrial radiation (Ra) for the first station (56004). In Table S2, T t and Ra t refer to the mean air temperature and extraterrestrial radiation at time t and vice versa. In the tables, t represents the month which ETo needs to be predicted, whereas t-1 represents the previous month.
It is apparent from Table S2 that the LSSVR-GSA has slightly better accuracy than the DENFIS, and they both perform superiorly to the M5RT in modeling monthly ETo. There are not large differences between the optimal T-based and Ra-based models; the RMSE differences are 12.6%, 12.9%, and 5.9% for the LSSVR-GSA, DENFIS, and M5RT, respectively. It is worth noting that Ra-based models only use Julian date and do not require temperature measurement.
In Table 1, the model results are presented for the optimal (best) inputs, including periodicity input (α), which indicates the month number of the output. In the table, Opt T and Opt Ra refer the optimum T and optimum Ra inputs, which provided the best accuracy in the test stage (see Table S2); Opt T based models for LSSVR-GSA, DENFIS, and M5RT are found as T t , T t-1, T t-2 , T t , T t-1, T t-2, T t-3 , T t , T t-1, and T t-2 respectively. For the optimal Ra input combination, Ra t , Ra t-1, Ra t-2, and Ra t-3 input combination provided the best results for DENFIS and LSSVR-GSA models, whereas, Ra t , Ra t-1, Ra t-2 provided best results for the M5RT model. As clearly observed from Table 1, the models' performances are also examined by considering both optimum temperature and extraterrestrial radiation inputs. As evident from the table, including periodicity input marginally improves the models' accuracy, and combining optimal T and optimal Ra inputs worsens the performances of the three methods in prediction of monthly ETo.
Model estimates of each method with Opt T and Opt Ra inputs in the test period are compared in Figure 4a,b in the form of hydrograph for the first station (56021). It can be seen from the zoomed sections that the LSSVR-GSA is closer to the FAO 56 PM ETo than the DENFIS and M5RT models. Figure 5 makes a scatterplot comparison of the model estimates. As seen from the graphs, LSSVR-GSA has less scattered points, closely followed by the DENFIS model, for both temperature and extraterrestrial radiation inputs.   Validation and test statistics of the three applied methods are reported in Table S3 for the prediction of monthly ETo of the second station (56021). It is observable from Table S3 that the temperature-based LSSVR-GSA models outperform the corresponding DENFIS and M5RT models. The best LSSVR-GSA model has lower RMSE (0.230 mm) and MAE (0.179 mm) and higher R 2 (0.956) compared to the other two best models. In the case of extraterrestrial radiation input, however, LSSVR-GSA has slightly better accuracy than the DENFIS and M5RT models. It is worth noting here that Ra-based models perform superiorly regarding the T based models, except LSSVR-GSA, so that there is a marginal difference between R-based and T-based models. This result caries practical importance because R-based models only use Julian's date, and they can predict monthly ETo without climatic data. For station 56021, the optimal T inputs were found T t , T t-1, and T t-2, for LSSVR-GSA and M5RT models, whereas they were T t , T t-1, T t-2, and T t-3 for DENFIS model.
The optimal models with the periodicity information are compared in Table 2 for the second station (56021). In this station, also adding periodicity information slightly improves prediction accuracy, and in some cases, it negatively affects the models' accuracies (e.g., see DENFIS with Opt Ra input and M5RT with Opt T and Opt T, Opt Ra inputs). Here, also combining optimal temperature and extraterrestrial radiation inputs does not improve the models' exactness compared to optimal T and/or optimal Ra input-based models. Table 2. Validation and test statistics of the models in monthly ETo prediction using the optimal inputs of T, Ra, and α-Station 2 (56021).  Figure 6a,b illustrate the ETo estimates of three methods with Opt T and Opt Ra inputs in the test period for the second station (56021). As can be seen, the LSSVR-GSA estimates are closer to the FAO 56 PM ETo than the DENFIS and M5TR models. It can be said that the fluctuations of the model estimates are very close to each other in the case of Opt Ra input (Figure 6b). This confirms the statistics provided in Table 2. Scatterplot comparison of the model estimates is shown in Figure 7. It is evident from the graphs that the LSSVR-GSA has less scattered estimates compared to the other two models and is closely followed by the DENFIS. In the case of Ra input, the distributions of all three models are similar to each other.   Table S4 gives the validation and test statistics of the LSSVR-GSA, DENFIS, and M5RT models for prediction of monthly ETo of the third station (56029) using previous values of temperature and extraterrestrial radiation to find their optimal input combinations. In this station, the LSSVR-GSA method performs better than the DENFIS and M5RT methods; however, there is a slight difference among the Ra-based methods, similarly to the previous station (56021). Among the all models, the LSSVR-GSA with T t , T t-1 , T t-2 , and T t-3 inputs has the lowest RMSE (0.230 mm) and MAE (0.172 mm) and the highest R2 (0.954) in monthly ETo prediction. Table 3 compares the optimal T-and R-based models, together with periodicity input (α). For the LSSVR-GSA method including α generally, it worsens the prediction accuracy, while the performances of DENFIS models are considerably improved by importing periodicity input; the most considerable improvement is 25% for the model with Opt T input. Combining optimum T and optimum Ra inputs also considerably improve the prediction accuracy of DENFIS; improvements are 22% and 10% compared to Opt T and Opt Ra inputs. For this station, it can be concluded that the DENFIS model with Opt T, α input performs better than the other models. Figure 8a,b provide the ETo estimates of three methods with Opt T and Opt Ra inputs in the test period for the third station (56029). In this station, the difference between LSSVR and the other two methods is more clearly observed, mainly for the optimum temperature inputs. Figure 9 compares the models' estimates concerning their scatterplot distributions. The LSSVR-GSA has the estimates which are closer to the FAO 56 PM ETo with less scattered distribution compared to DENFIS and M5RT models. In the case of Ra input, the other two models and are closely followed by the DENFIS. In the case of Ra input, however, the distributions of all three models are very close to each other. Table 3. Validation and test statistics of the models in monthly ETo prediction using the optimal inputs of T, Ra, and α-Station 3 (56029).  Figure 8a and 8b provide the ETo estimates of three methods with Opt T and Opt Ra inputs in the test period for the third station (56029). In this station, the difference between LSSVR and the other two methods is more clearly observed, mainly for the optimum temperature inputs. Figure 9 compares the models' estimates concerning their scatterplot distributions. The LSSVR-GSA has the estimates which are closer to the FAO 56 PM ETo with less scattered distribution compared to DENFIS and M5RT models. In the case of Ra input, the other two models and are closely followed by the DENFIS. In the case of Ra input, however, the distributions of all three models are very close to each other.

Model Inputs
(a)

Conclusions
The potentials of two new machine learning methods, least-square support vector regression with a gravitational search algorithm and the dynamic evolving neural-fuzzy inference system, for modeling reference evapotranspiration using only temperature data, were investigated in the study, and estimates of the new methods were compared with the well-known M5 model tree method. Previous values of air temperature and extraterrestrial radiation were tried as inputs to the models to estimate monthly ETo. Three stations from China were used as case studies. The results obtained from the applications provided the following conclusions: 1.
In all three stations, the temperature or extra-terrestrial radiation-based LSSVR-GSA models performed superiorly to the DENFIS and M5RT models in estimating monthly ETo. In some cases, especially for the Ra based models, however, a slight difference was found between the LSSVR-GSA and DENFIS methods, while the M5RT provided the worst estimates.

2.
The results revealed that the only extra-terrestrial radiation input might provide better prediction accuracy for all the three methods, and this implies that the monthly ETo can be easily calculated with only Julian date or without temperature information.

3.
Importing periodicity information to the model's inputs generally improved the prediction accuracy, and the accuracy of DENFIS was considerably increased in the third station (56029).

4.
Combining optimum air temperature and extra-terrestrial radiation inputs generally did not increase the accuracy of the methods in terms of estimation of monthly ETo. In one station (third station), however, the combination of both inputs improved the accuracy of DENFIS methods by 22% and 10% (RMSE) compared to optimum T and optimum Ra inputs, respectively. In this station, this method performed better than the LSSVR-GSA and M5RT.
The results of this study recommend the use of the LSSVR-GSA model as an efficient tool for estimating monthly ETo using only temperature or extraterrestrial radiation. This is very important in practical applications, especially for the developing countries, where some climatic data may be missing or absent because of technical reasons or lack of opportunities.
Supplementary Materials: The following are available online at http://www.mdpi.com/1099-4300/22/5/547/s1. Table S1: The statistical parameters of the applied data. Table S2: Validation and test statistics of the models for monthly ET0 prediction-Station 1 (56004). Table S3: Validation and test statistics of the models for monthly ET0 prediction-Station 2 (56021). Table S4: Validation and test statistics of the models for monthly ET0 prediction-Station 3 (56029).