Evaluation of Machine Learning Versus Empirical Models for Monthly Reference Evapotranspiration Estimation in Uttar Pradesh and Uttarakhand States, India

.


Introduction
For optimal utilization of scarce water resources, an accurate estimation of crop evapotranspiration (ETc) is crucial for running large irrigation systems by enhancing the water application efficiency [1,2].Moreover, the ETc plays an important role in acquiring knowledge about the appropriate management of water resources, irrigation scheduling, crop water use, crop production, and water conservation [2].Usually, ETc is estimated by computing the reference evapotranspiration (ETo) and then multiplying ETo with Kc (crop coefficient) [3,4].Therefore, the ETo is the key factor to improve irrigation and water use efficiencies [5].Accordingly, the Penman-Monteith (PM) model was introduced by the FAO (Food and Agriculture Organization) and is considered a benchmark model for ETo computation [6].The FAO-56 PM model requires numerous climatic parameters to estimate ETo, which are often incomplete or unavailable, especially in developing countries [7,8].Hence, the best possible alternatives are necessary to implement, requiring less climatic data for ETo estimation [9].
In the last decade, the aforesaid issues have been tackled by the ML models to estimate ETo with limited climatic variables on different time scales [10][11][12][13].Some of the ML models such as SVM, RF, M5P, ELM (extreme learning machine), ANN (artificial neural network), ANFIS (adaptive neuro-fuzzy inference system), XGBoost (extreme gradient boosting), MARS (multivariate adaptive regression splines), and GEP (gene expression programming) received the massive application in ETo estimation [14][15][16][17][18][19].The results of these studies report the better performance of ML models in comparison to empirical models.Apart from that, the ML models have become popular in modelling watershed hydrology [20,21].Furthermore, Ashrafzadeh et al. [22] employed the SVM, GMDH (group method of data handling), and SARIMA (seasonal autoregressive integrated moving average) techniques to estimate the monthly ETo in the Guilan Plain of Northern Iran.They noted the better feasibility of the SVM, GMDH, and SARIMA models in the study region.Chen et al. [10] applied six ML models including DNN (deep neural network), TCN (temporal convolution neural network), LSTM (long short-term memory), SVM, and RF, and seven empirical models, i.e., Hargreaves, modified Hargreaves, Ritchie, Priestley-Taylor, Makkink, Romanenko, and Schendel, to estimate the daily ETo on the Northeast Plain of China.The results of the investigation demonstrate that the ML models performed superior to the empirical models.Mehdizadeh et al. [23] coupled ANFIS with SFLA (shuffled frog leaping algorithm) and IWO (invasive weed optimization) algorithms for estimation of daily ETo at the Tabriz and Shiraz stations of Iran.The performances of the ANFIS-SFLA and ANFIS-IWO models were compared with the Priestley-Taylor, Hargreaves-Samani, Romanenko, and Valiantzas models, and noted that the ANFIS-IWO model provides better estimates than the other models.Adnan et al. [24] estimated monthly ETo at the Dhaka and Mymensing stations of south-central Bangladesh using the ANFIS-MFO (moth flame optimization), ANFIS-WCA (water cycle algorithm), and AN-FIS-WCOMFO models.Results of the evaluations reveal that the hybrid ANFIS-WCOMFO model performed superior to the other models.
In a related context recently, several nature-inspired algorithms have been embedded with ML models to optimize their performance in ETo estimation [25].Alizamir et al. [1] estimated monthly ETo at two sites (Antalya and Isparta) placed in Turkey by employing the hybrid of the ANFIS-PSO (particle swarm optimization) and ANFIS-GA (genetic algorithm) against the classical CART (classification and regression tree), ANN, and ANFIS models.They reported that the hybrid ANFIS-PSO and ANFIS-GA models produce better estimates than other models at both stations.Maroufpoor et al. [26] applied the hybrid ANN-GWO (grey wolf optimizer) for estimating the monthly ETo in five different climates (i.e., arid, semi-arid, hyper-arid, humid, and sub-humid) of Iran.The efficacy of ANN-GWO was compared against the ANN and LSSVR (least square support vector regression) models, and found that the hybrid ANN-GWO model was more efficient than other models in all climates.Rezaabad et al. [27] predicted the daily ETo in the Kerman province of Iran by coupling the ANFIS with the IWO (weed optimization algorithm), ICA (imperialist competitive algorithm), TLBO (teaching-learning-based optimization), and BBO (biogeography-based optimization) algorithms.They found that the ANFIS-ICA model with EC = 0.98, RMSE = 0.50 mm/day and CC = 0.99 was superior to other models.Chia et al. [28] optimized the ELM with three nature-inspired algorithms, namely PSO, MFO (mothflame optimization), and WOA (whale optimization algorithm) for estimating daily ETo at the Sibu, Miri, and Sandakan sites (Malaysia).Results showed that the ELM-WOA models outperformed the other models at all locations with RMSE of 0.0011 to 0.1927 mm/day, MAE of 0.0007 to 0.1443 mm/day, and R 2 (determination coefficient) of 0.9486 to 1.0000.
However, these studies also support the better viability of the ML models enhanced with numerous nature-inspired algorithms.
From the above-mentioned literature, it was noted that several studies have been conducted on ETo estimation on different time scales in different climates.However, according to our knowledge, so far, the support vector machine (SVM), M5P model tree (M5P), random forest (RF), and empirical models (i.e., Valiantzas-1, Valiantzas-2, Valiantzas-3) were not used for monthly ETo estimation at the Nagina and Pantnagar stations.Thus, this study was optimized with the specific objectives as (i) to formulate the three ML models, i.e., SVM, M5P, and RF, for monthly ETo estimation at both locations, and (ii) to compare the efficacy of three ML models against the empirical models based on statistical and graphical investigations.Moreover, the SVM model has better generalization ability than other ML models [29].It is also highly robust to outliers [30].Therefore, it is expected that the SVM can provide a better estimation of ETo, which is highly complex and contains a large number of outliers.ETo is one of the complex and vital hydrological variables, so this way of simulation will improve the estimation accuracy of ETo and will help in maintaining the agricultural water resources management operation for controlling the increasing water stress in agriculture caused by global ecological fluctuations.sites was portioned into two phases: (i) a calibration phase that includes 60% data from 2009-2013, and (ii) a validation phase that contains 40% data from 2014-2016 for evaluation of machine learning against empirical models.Likewise, Table 1 summarizes the information about the geographical coordinates and descriptive statistics, i.e., minimum, maximum, mean, standard deviation, skewness, and kurtosis, of both stations from 2009 to 2016.It was noted from Table 1 that the maximum ETo variation was of 6.76 mm/month at Nagina and 7.68 mm/month at Pantnagar.
Table 2. Formulas of empirical models used at study sites.

Penman-Monteith Model
The present study utilized the Penman-Monteith (PM) model given by the Food and Agricultural Organization, with No. 56 designated as FAO-56 PM to compute the monthly ETo values at both study sites and written as [6]: where   = reference evapotranspiration in mm/month, ∆ = slope of saturation vapor pressure in kPa/°C,   = net radiation in MJ/m 2 /month,  = soil heat flux density in MJ/m 2 /month,  = psychrometric constant in kPa/°C, and   and   = saturation and actual vapor pressures in kPa.The computed time-series values of monthly ETo by the FAO-56 PM model were considered as reference data to appraise the performance of the empirical models (i.e., V-1 to V-3) and ML models (i.e., SVM, M5P, and RF).

Support Vector Machine
Over time, for optimizing the nonlinear problems, the ML models, including the support vector machine (SVM), have been utilized in numerous fields such as for predicting the penetration rate of tunnel-boring machines [33], solar radiation prediction [34], streamflow forecasting [35], landslide hazard modelling [36][37][38], seawater level simulation [39], forecasting electric load [40], and infiltration simulation [41,42].The SVM approach was recommended by Vapnik [43] and derived from statistical learning theory to solve classification and regression problems [44].Figure 2 displays the typical assembly of the SVM model.The SVM technique applied the SRM (structural risk minimization) principle [45].The SVM model utilized a nonlinear mapping function (()) to project the calibration (or training) data points into a high-dimensional feature space, and the following linear regression function is obtained in the feature space [45]: where  = output of SVM,  = input of SVM ( 1 ,  2 , … ,   ), () = loss function,  = weight vector of high-dimensional feature space, and  = constant.Following the principle of SRM, accepting the ε-insensitive loss function, the minimal  is updated for resolving the convex optimization problem as follows [29,45]: where  = penalty factor,  i , and  i * = slack variables, and ε = tube size insensitive constant (Equation ( 3)).Next, the Lagrangian multiplier is used to resolve the dual convex optimization problem in Equation ( 3), and the following solution is obtained: where   ,   * ,   , and   * = Lagrangian multipliers, which satisfy the non-negative constraints.The Lagrangian function () minimizes , ,   ,   * and maximizes   ,   * ,   , and   * according to the Karush-Kuhn-Tucker condition, and finally the regression function of SVM can be obtained as: ) (  ,   ) +  (5) where (  ,   ) = kernel function (KF), i.e., (  ,   ) = ((  ) • (  ).The choice of an appropriate kernel function improves the performance of the SVM model.A variety of KF are available but the most reliable and efficient is the RBF (radial basis function) [37,46].The RBF is expressed as [40]: where   and   = input space vectors, and  = kernel parameter.The  and  are the two most significant factors, which influence the accuracy of the SVM model.In the present study, both factors were optimized through the hit-and-trail procedure (C = 2, and  = 0.1) for predicting monthly ETo at two study sites.Further exhaustive background about the SVM can be gained from Vapnik [43], and Smola and Schölkopf [47].

M5P Tree
The M5P tree is a data-mining technique, projected by Quinlan [48].The association among output (dependent)-input (independent) variables is established based on a binary decision tree having a linear regression function at the leaf (terminal nodes).The divideand-conquer approach is applied to produce the tree-based models [49].Figure 3 displays the well-organized topology of the M5P tree model.The construction of a decision tree involves two stages: step-1: splitting the data into subgroups to create the decision tree by utilizing the principle of the standard deviation (std) and reducing the model training error at node [50], and step-2: pruning of the overfitted tree (sample) and swapping the subtrees with linear regression functions [49].Finally, the SDR (standard deviation reduction) is computed as [49,50]: where  defines a group of samples that grasps the nodes and   signifies the subgroup of samples that have the jth consequence of the latent set.In recent times, researchers have explored the successful application of the M5P model in the simulation of several hydrological processes like drought forecasting [50], infiltration simulation [51], river discharge forecasting [52,53], reference evapotranspiration estimation [49,54], stage-discharge forecasting [55], and groundwater level prediction [56].For comprehensive information about the M5P tree, readers refer to Quinlan [48].

Random Forest
The random forest (RF) algorithm was designed by Breiman [57] for solving highdimension classification and regression problems.Recently, the RF model received popularity in diverse fields of sciences such as, for instance, infiltration rate prediction [51], land use/land cover classification [58], and soil temperature estimation [59].Figure 4 illustrates the hierarchical network of the RF classifier.The construction of the RF model comprises two steps: (i) an ensemble of decision trees (or classifiers) used to build the "RF" through supervised learning, and (ii) making predictions of each decision tree formed in the first step.The RF algorithm is comparatively insensitive to features of the training set and can achieve high prediction accuracy [57].In the present study, the RF model was built by using a trial-and-error process in WEKA 3.9 software for the prediction of monthly ETo at both study locations.

Model Formulation and Statistical Indicators
Different combinations of four climatic variables, namely T, RH, u, and Rs, were used for the estimation of monthly ETo at two locations in the present research.Three combinations of four inputs were formulated based on Valiantzas' [31,32] concept and presented in Table 3.All inputs are used for C-1, three inputs for C-2, and two inputs for C-3.All three input combinations were used to train and test the ML models.Afterward, five statistical indicators, i.e., MAE (mean absolute error), RMSE (rootmean-square error), EC (efficiency coefficient), CC (correlation coefficient), and WI (Willmott index), were utilized to evaluate the predictive efficacy of the empirical (i.e., V-1 to V-3) and ML (i.e., SVM, M5P, RF) models used in the present study.Also, the graphical inspection includes temporal variation graphs, scatter plots, and Taylor diagrams that were used to make a clear interpretation of results yielded by the empirical and ML models.Table 4 shows the formulas of MAE, RMSE, EC, CC, and WI along with their range.

Performance Evaluation using Graphical Inspection
The graphical inspection was another goodness-of-fit criterion for evaluating the relative performance of the ML and empirical models during the validation phases at both stations.Figures 7a-c and 8a-c illustrates the temporal variation and scatter plots of observed versus estimated monthly ETo values by the SVM, M5P, RF, and Valiantzas models equivalent to C-1, C-2, and C-3 input combinations at the Nagina and Pantnagar sites, respectively, during the validation phase.In these figures, the outputs of the four models were fitted with a 1:1 line (best-fit line) with relative error bands of ±10%, and the coefficient of determination (R 2 ) between the observed and model outputs was also presented on the plots.If the data are concentrated or close to the 1:1 line (black line) within ±10% relative error bands, this indicates better performance of a model.These figures clearly show the higher performance of the SVM model compared to other models during the validation phase on both stations.In addition, the R 2 value was found highest in the SVM-1 model (0.995) for Nagina (Figure 7a), and 0.999 for Pantnagar (Figure 8a) in the validation stage, compared to other ML and empirical models.Overall, the SVM model can be considered optimal in estimating monthly ETo in terms of the results presented in Figures 7a-c and 8a-c.The performance of the ML and empirical models was also evaluated using the Taylor diagram [66].The obtained result during the validation period is presented in Figures 9a-c and 10a-c for Nagina and Pantnagar, respectively.The red circle on the x-axis of the Taylor diagram represents the observed monthly ETo.A model is considered better if it is near the observed point.Taylor's diagram compares three statistics (i.e., RMSE, Std, and CC) together in a graphical way and, therefore, provides a reliable assessment of the relative performance of different models.The Taylor diagram of the models during validation showed a much better performance of SVM compared to other models at the Nagina (Figure 9a-c) and Pantnagar (Figure 10a-c) stations.In addition, the SVM-predicted monthly ETo was found better-correlated with the observed monthly ETo with less RMSE compared to other models on both stations during the validation phase.Likewise, the Std of SVM-predicted ETo was found much closer to observed ETo in comparison to other models on both stations during the validation.Therefore, SVM can be ranked as the best model in terms of the results presented in the Taylor diagram followed by the M5P, RF, and Valiantzas models at both study sites.

Discussion
Evapotranspiration is a complex hydrological process that depends on the integrated effect of several climatic variables [69].It also governs the soil moisture, surface runoff, plant growth, and groundwater recharge for optimizing the available water resources [70].Furthermore, it determines the processes responsible for land-atmosphere interaction or formation of the geographical environment, and weather and climate change through ground heat and moisture balance, and water balance and surface heat balance studies [70][71][72].Similarly, Seong et al. [73] projected the implications of different potential evapotranspiration (PET) methods on streamflow under climate change in the Susquehanna River basin of the northeastern United States.They found that the streamflow projections are sensitive to the selection of the PET methods.So, the formulation of a reliable and robust model of ETo estimation is necessary for maintaining water resources and agricultural operations on farmland under a changing climate.The ML models can handle this issue very well.In this study, three ML models such as SVM, M5P, and RF were developed for monthly ETo estimation on two sites (Nagina and Pantnagar) and their outcomes were compared with empirical models.The appraisal of results shows the better feasibility of the SVM over other models at both sites.Similarly, Kaya et al. [74] estimated daily ETo in the Kosice City area of Slovakia by employing three ML models, i.e., MLP (multilayer perceptron), SVR (support vector regression), and MLR (multi-linear regression).The daily data of wind speed, relative humidity, air temperature, and solar radiation were supplied as input to these models.The performance of the ML models was evaluated against the empirical models (Hargreaves-Samani, Ritchie, & Turc), and it was found that the ML-based models provide better results than the empirical models.Kisi et al. [75] hybridized the M5 model tree with a radial basis function (RM5Tree) for estimating daily ETo at three stations (Antalya, Adana, and Isparta) in Turkey using the daily record of wind speed, relative humidity, air temperature, and solar radiation.The estimates of the RM5Tree model were compared with M5Tree, MLP, RSM (response surface method), and RBFNN (radial basis function neural network).Overall, they found the RM5Tree model provides more optimal results than the other models.
Furthermore, the findings of this research were equated to other studies conducted on ETo estimation by exploiting the ML techniques, for instance [1,9,22,76,77].Tikhamarine et al. [11] optimized the SVR model with the WOA, MVO (multi-verse optimizer), and ALO (ant-lion optimizer) algorithms to predict the monthly ETo at the Algiers and Tlemcen weather stations located in north Algeria.They found better performance of the SVR-WOA model with WI = 0.9987, 0.9997, CC = 0.9975, 0.9995, EC = 0.9949, 0.9989, RMSE = 0.0808, 0.0617 mm/month, and MAE = 0.0658, 0.0489 for the Algiers and Tlemcen sites, respectively.Gonzalez del Cerro et al. [78] compared the predictive performance of the ANFIS against the radiation and temperature-based empirical models for estimating the daily ETo in Tamil Nadu and the Coimbatore provinces of India.Results reveal that the ANFIS-based model (MAE = 0.0008 mm/day, WI = 0.9999, and CC = 0.9999) with all climatic data, i.e., mean air temperature, relative humidity, wind speed, and solar radiation produce better estimates than the empirical models.Ahmadi et al. [79] estimated monthly ETo on six stations located in Iran by exploiting three ML models, namely the SVR, GEP (gene expression programming), SVR-IWD (intelligent water drops) against the Priestley-Taylor, and H-S Hargreaves-Samani models.A comparison of results shows that the SVR-IWD model outperformed the other models at all stations.
To this end, the aforementioned studies also recommend the effectiveness of machine learning models over the empirical models in predicting monthly ETo at both study locations.

Conclusions
The effectiveness of three machine learning models, such as the support vector machine (SVM), M5P tree (M5P), and random forest (RF), was investigated in predicting monthly ETo on the Nagina and Pantnagar stations from 2009 to 2016 in the present study.From the available 8-year climatic data (2009-2016) at both stations, a total of three combinations of different inputs were established to calibrate (train) and validate (test) the ML models over the empirical models (i.e., Valiantzas-1, Valiantzas-2, Valiantzas-3) based on statistical indicators and graphical inspection.The results of the evaluation demonstrate that the SVM models with the full set of climatic data, i.e., T, RH, u, and Rs, performed superior to the M5P, RF, and Valiantzas models during the validation period at both locations under this study.In addition, the predictive accuracy of the SVM-1 to SVM-3 models with respect to RMSE improved 32.9% to 59.1%, 4.3% to 60.1%, and 23.7% to 70.4%, for M5P-1 to M5P-3, RF-1 RF-3, and V-1 to V-3, respectively, on Nagina and 59.9% to 66.1%, 16.7% to 48.9%, and 19.6% to 47.6% for M5P-1 to M5P-3, RF-1 RF-3, and V-1 to V-3, respectively, on Pantnagar.This percentage analysis also reveals the supremacy of the SVM model in predicting monthly ETo at both sites under consideration.Furthermore, the performance of the empirical models was recorded as poor at both sites in comparison to the ML models.Overall, the findings of this research show that the ML models (i.e., SVM) had better efficacy and will support the irrigation engineers, agriculturists, and

Figure 1
Figure 1 demonstrates the location map of the Nagina and Pantnagar stations positioned in Uttar Pradesh and Uttarakhand States of India.The monthly climatic data of mean air temperature (T, °C), relative humidity (RH, %), wind speed (u, m/s), and solar radiation (Rs, MJ/m 2 /month) of Nagina and Pantnagar from 2009 to 2016 (8-year) were collected from the Rice Research Station of Bijnor district in Uttar Pradesh State (India), and the CRC (Crop Research Centre, Pantnagar, India) of the G.B. Pant University of Agriculture and Technology, Uttarakhand State.The 8-year monthly climatic data of both sites was portioned into two phases: (i) a calibration phase that includes 60% data from 2009-2013, and (ii) a validation phase that contains 40% data from 2014-2016 for evaluation of machine learning against empirical models.Likewise,Table 1 summarizes the information about the geographical coordinates and descriptive statistics, i.e., minimum, maximum, mean, standard deviation, skewness, and kurtosis, of both stations from 2009 to 2016.It was noted from Table 1 that the maximum ETo variation was of 6.76 mm/month at Nagina and 7.68 mm/month at Pantnagar.

Figure 1 .
Figure 1.Location map of study sites.

Figure 2 .
Figure 2. The architecture of the SVM model.

Figure 4 .
Figure 4. Typical structure of the RF model.

Figure 5 .
Figure 5. Heatmap of statistical indicators values produced by ML and empirical models corresponding to C-1 to C-3 input combinations in the validation phase at the Nagina station.

Figure 6 .
Figure 6.Heatmap of statistical indicators values produced by ML and empirical models corresponding to C-1 to C-3 input combinations in the validation phase at the Pantnagar station.

Figure 7 .
Figure 7.Comparison of observed (FAO-56 PM) against estimated ETo values by the ML and Valiantzas models corresponding to (a) C-1, (b) C-2, and (c) C-3 input combinations during the validation stage at Nagina station.

Figure 8 .
Figure 8.Comparison of observed (FAO-56 PM) against predicted ETo values by the ML and Valiantzas models corresponding to (a) C-1, (b) C-2, and (c) C-3 input combinations during the validation stage at Pantnagar station.

Figure 9 .
Figure 9. Taylor's diagram of ML and empirical models corresponding to (a) C-1, (b) C-2, and (c) C-3 input combinations during the validation stage at Nagina station.

Figure 10 .
Figure 10.Taylor's diagram of ML and empirical models corresponding to (a) C-1, (b) C-2, and (c) C-3 input combinations during the validation stage at Pantnagar station.

Table 3 .
Different input combinations for the formulation of ML models at study sites.

Table 4 .
Formulas of different performance indicators.
Note:   , , and   , = estimated and observed monthly reference evapotranspiration values at an ith time step, N = number of observations.   ̅̅̅̅̅̅̅̅ , and    ̅̅̅̅̅̅̅̅ = mean of observed and predicted monthly reference evapotranspiration.