# The Superiority of Data-Driven Techniques for Estimation of Daily Pan Evaporation


## Abstract

In the present study, the estimation of daily pan evaporation (E_{pan}) was evaluated based on different input parameters: maximum and minimum temperatures, relative humidity, wind speed, and bright sunshine hours. The techniques used for estimating E_{pan} were the artificial neural network (ANN), wavelet-based ANN (WANN), radial function-based support vector machine (SVM-RF), linear function-based SVM (SVM-LF), and multi-linear regression (MLR) models. The proposed models were trained and tested in three different scenarios (Scenario 1, Scenario 2, and Scenario 3) utilizing different percentages of data points: Scenario 1 uses 60%:40%, Scenario 2 uses 70%:30%, and Scenario 3 uses 80%:20% for the training and testing datasets, respectively. Statistical tools such as Pearson's correlation coefficient (PCC), root mean square error (RMSE), Nash–Sutcliffe efficiency (NSE), and the Willmott index (WI) were used to evaluate the performance of the models. Graphical representations, such as line diagrams, scatter plots, and the Taylor diagram, were also used to evaluate the proposed models' performance. The results showed that the SVM-RF model's performance is superior to the other proposed models in all three scenarios. The most accurate values of PCC, RMSE, NSE, and WI were 0.607, 1.349, 0.183, and 0.749, respectively, obtained by the SVM-RF model in Scenario 1 (60%:40% training:testing). This indicates that, as the training sample set grows, the testing data show a less accurate modeled result. Thus, the evolved models produce comparatively better outcomes and foster decision-making for water managers and planners.

## 1. Introduction

It is difficult to quantify evaporation (E_{m}) through direct measurement. As a result, in the hydrological field, the introduction of robust and reliable intelligent models is necessary for precise estimation [9,10,11,12,13,14].

Numerous approaches have been proposed to estimate E_{pan} values, as reported by [15,16,17,18]. Since evaporation is a non-linear, stochastic, and complex process, a reliable formula representing all the physical processes involved is difficult to obtain [19]. In recent years, most researchers have acknowledged the use of artificial intelligence techniques, such as artificial neural networks (ANNs), the adaptive neuro-fuzzy inference system (ANFIS), and genetic programming (GP), in hydrological parameter estimation [15,20,21,22]. In estimating E_{pan}, Sudheer et al. [23] used an ANN and found that it worked better than the traditional approaches. For modeling western Turkey's daily pan evaporation, Keskin et al. [24] used a fuzzy approach. To estimate daily E_{pan}, Keskin and Terzi [25] developed multi-layer perceptron (MLP) models and found that the ANN model performed significantly better than the traditional system. Tan et al. [26] applied the ANN methodology to model hourly and daily open-water evaporation rates. In daily E_{pan} modeling, Kisi and Çobaner [27] used three distinct ANN methods, namely, the MLP, the radial basis neural network (RBNN), and the generalized regression neural network (GRNN), and found that the MLP and RBNN performed much better than the GRNN. In a hot and dry climate, Piri et al. [28] applied the ANN model to estimate daily E_{pan}. The evaporation estimation methods discussed by Moghaddamnia et al. [19] were implemented based on ANN and ANFIS, and the findings of both techniques were considered superior to those of the analytical formulas. Keskin et al. [29] used fuzzy sets and ANFIS for daily modeling of E_{pan} and found that the ANFIS method could model the evaporation process more efficiently than fuzzy sets. Dogan et al. [30] used the ANFIS approach to calculate pan evaporation from the Yuvacik Dam reservoir, Turkey. Tabari et al. [31] examined the potential of ANN and multivariate non-linear regression techniques to model daily pan evaporation and concluded that the ANN performed better than non-linear regression. Guven and Kişi [20] modeled daily pan evaporation using linear genetic programming and compared it against gene-expression programming (GEP), MLP, RBNN, GRNN, and Stephens–Stewart (SS) models. Two distinct evapotranspiration models were also compared, and the subtractive clustering (SC) model of ANFIS was found to produce reasonable accuracy with less computational effort than the ANFIS-GP and ANN models [32].

In this study, to estimate E_{pan}, the ANN, WANN, radial function-based support vector machine (SVM-RF), linear function-based support vector machine (SVM-LF), and multi-linear regression (MLR) models were applied to climatic variables.

Many studies have estimated E_{pan} based on weather variables using data-driven methods. However, the estimation of E_{pan} based on lag-time weather variables, which can be obtained easily, is not standard. After testing different acceptable combinations as input variables, the same inputs were used in the artificial intelligence models. The main objectives of the proposed study are (1) to model E_{pan} using the ANN, WANN, SVM-RF, SVM-LF, and MLR models under different scenarios and (2) to select the best-developed model and scenario for E_{pan} estimation based on statistical metrics. The document is organized as follows: Section 2 contains the study's materials and methods; Section 3 gives the statistical indexes and methodological properties; the models' applicability to evaporation prediction and the results are discussed in Section 4; the conclusion is found in Section 5.

## 2. Materials and Methods

#### 2.1. Study Area and Data Collection

The daily climatic data used in this study consist of the maximum and minimum air temperatures (T_{max} and T_{min}, °C), relative humidity at 7 a.m. (RH-1, %) and at 2 p.m. (RH-2, %), wind speed (WS, km/h), bright sunshine hours (SSH, h), and daily pan evaporation (E_{Pan}, mm). For modeling pan evaporation, five years of daily data covering 1 June to 30 September of each year, a total of 610 records, were used as input, and the corresponding E_{Pan} series was used as output [35].

#### 2.2. Statistical Analysis

Table 1 summarizes the statistics of the daily climatic variables: maximum and minimum temperatures (T_{max} and T_{min}, °C), relative humidity at 7 a.m. (RH-1, %) and at 2 p.m. (RH-2, %), wind speed (WS, km/h), bright sunshine hours (SSH, h), and daily pan evaporation (E_{Pan}, mm). The statistical analysis includes the mean, median, minimum, maximum, standard deviation (Std. Dev.), kurtosis, and skewness values from 2013 to 2017. The data are moderately to highly skewed, which has a considerable negative effect on model performance. Standard deviations farther from zero indicate higher variability, i.e., a greater spread of the values around the mean. The kurtosis values depict the platykurtic or leptokurtic nature of the climatic parameters, according to whether the kurtosis value is less than or greater than 3.

Table 2 presents the cross-correlation of the climatic variables with E_{Pan} at a significance level of 5%.
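The Table 1 summary statistics can be reproduced with plain moment estimators; the sketch below makes the estimator conventions explicit (the exact conventions used in the original analysis are an assumption):

```python
import numpy as np

def summary_stats(x):
    """Mean, median, min, max, sample std, skewness, and excess kurtosis
    of a 1-D sample, as tabulated in Table 1. Plain standardized-moment
    estimators are used here; spreadsheet-style bias corrections would
    give slightly different skewness/kurtosis values."""
    x = np.asarray(x, dtype=float)
    m = x.mean()
    s = x.std(ddof=1)              # sample standard deviation
    z = (x - m) / s                # standardized values
    return {
        "mean": m,
        "median": float(np.median(x)),
        "min": float(x.min()),
        "max": float(x.max()),
        "std": s,
        "skewness": float((z ** 3).mean()),
        "kurtosis": float((z ** 4).mean()) - 3.0,  # excess kurtosis (normal = 0)
    }
```

Applied to the E_{Pan} record, for example, this yields the row of Table 1 for that variable (up to the estimator convention).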

#### 2.3. Data-Driven Techniques Used

#### 2.3.1. Artificial Neural Network

An ANN consists of input, hidden, and output layers of interconnected neurons; learning proceeds by adjusting the connection weights W_{ij} and W_{jk} between the layers of neurons. The typical structure using the input variables is shown in Figure 3.
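A minimal sketch of the forward pass may clarify the role of the two weight matrices; the weights and inputs below are hypothetical placeholders, since in the study they are learned from the climate data (with the Levenberg–Marquardt algorithm and tanh transfer function noted later in the text):

```python
import numpy as np

rng = np.random.default_rng(0)

def ann_forward(x, W_ij, b_j, W_jk, b_k):
    """Forward pass of a 6-5-1 feed-forward network: a tanh hidden layer
    and a linear output. W_ij connects the six inputs to the hidden
    neurons; W_jk connects the hidden neurons to the single output."""
    h = np.tanh(x @ W_ij + b_j)   # hidden-layer activations
    return h @ W_jk + b_k         # E_pan estimate (linear output)

# Hypothetical weights for illustration only.
W_ij = rng.normal(size=(6, 5)); b_j = np.zeros(5)
W_jk = rng.normal(size=(5, 1)); b_k = np.zeros(1)
x = rng.normal(size=(3, 6))       # three days of scaled climate inputs
print(ann_forward(x, W_ij, b_j, W_jk, b_k).shape)  # (3, 1)
```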

#### 2.3.2. Wavelet Artificial Neural Network (WANN)

In the WANN, the discrete wavelet transform (DWT) is coupled with the ANN for E_{Pan} (mm) estimation. The DWT decomposes the original input time series of T_{max}, T_{min}, RH-1, RH-2, WS, and SSH into different frequency components (Figure 4), adapted from Rajaee [44]. Feeding these decomposed sub-series of T_{max}, T_{min}, RH-1, RH-2, WS, and SSH into an ANN results in a wavelet artificial neural network (WANN) [42]. Three levels of the Haar à trous decomposition algorithm were used in this study. The Levenberg–Marquardt algorithm was used for training the model, and the hyperbolic tangent sigmoid transfer function was used to compute a layer's output from its net input.
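The à trous decomposition itself is simple enough to sketch. The version below assumes circular boundary handling (an assumption of this sketch), and its output satisfies the defining property that the detail sub-series plus the final smooth component reconstruct the original series:

```python
import numpy as np

def haar_a_trous(x, levels=3):
    """Haar 'a trous' (stationary) wavelet decomposition, as in the
    three-level scheme used for the WANN inputs. At level j the signal
    is smoothed with a Haar kernel dilated by 2**j; the detail is the
    residual. Returns (details, smooth) with x == sum(details) + smooth."""
    c = np.asarray(x, dtype=float)
    details = []
    for j in range(levels):
        shift = 2 ** j
        c_next = 0.5 * (c + np.roll(c, shift))  # dilated Haar smoothing (circular)
        details.append(c - c_next)              # high-frequency detail at level j
        c = c_next
    return details, c
```

Each of the six climate inputs would contribute its three detail sub-series plus the smooth component, giving the 24 WANN inputs seen in the model structures of Tables 4 to 6.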

#### 2.3.3. Support Vector Machine

In support vector regression, the relationship between the inputs and the output is approximated by the function:

$$f\left(\mathbf{x}\right)={\mathbf{w}}^{T}\Phi \left(\mathbf{x}\right)+b$$

where **w** is the weight vector, b is the bias, and Φ(**x**) is the high-dimensional feature space, linearly mapped from the input space **x**. Equation (3) can be transformed into higher dimensions and gives the final expression:

$$f\left(\mathbf{x}\right)={\sum }_{i=1}^{n}\left({\alpha }_{i}-{\alpha }_{i}^{*}\right)K\left({x}_{i},\mathbf{x}\right)+b$$

where α_{i} and α_{i}^{*} are Lagrange multipliers and K(x_{i}, x_{j}) is the kernel function. Two kernel functions were used in this study:

- Linear kernel function (LF): the most basic form of kernel function is written as:$$K\left({x}_{i},{x}_{j}\right)=\left({x}_{i}\cdot {x}_{j}\right)$$
- Radial basis function (RBF): an RBF mapping is identically represented as Gaussian bell shapes:$$K\left({x}_{i},{x}_{j}\right)=\mathrm{exp}\left(-\gamma {\Vert {x}_{i}-{x}_{j}\Vert }^{2}\right)$$
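The two kernels translate directly into code; γ = 0.16 below echoes the tuned value reported in Tables 4 to 6, and the function names are illustrative:

```python
import numpy as np

def linear_kernel(xi, xj):
    """K(xi, xj) = xi . xj, the linear kernel underlying SVM-LF."""
    return float(np.dot(xi, xj))

def rbf_kernel(xi, xj, gamma=0.16):
    """K(xi, xj) = exp(-gamma * ||xi - xj||^2), the radial basis
    function underlying SVM-RF. gamma = 0.16 matches the tuned value
    in Tables 4-6; it controls how quickly similarity decays with
    distance between the two input vectors."""
    d = np.asarray(xi, dtype=float) - np.asarray(xj, dtype=float)
    return float(np.exp(-gamma * np.dot(d, d)))
```

Note that the RBF kernel equals 1 only when the two inputs coincide and decays toward 0 as they separate, which is the "Gaussian bell" behavior mentioned above.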

#### 2.3.4. Multiple Linear Regression (MLR)

MLR expresses the dependent variable as a linear combination of the independent variables:

$$y={c}_{0}+{c}_{1}{x}_{1}+{c}_{2}{x}_{2}+\dots +{c}_{n}{x}_{n}$$

where x_{1}, x_{2}, …, x_{n} are the independent variables, c_{1}, c_{2}, …, c_{n} are the regression coefficients, and c_{0} is the intercept. These values describe the local behavior and are calculated using the least-squares rule or another regression method [27].
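A least-squares fit of this equation takes only a few lines; the helper names below are illustrative:

```python
import numpy as np

def fit_mlr(X, y):
    """Least-squares fit of y = c0 + c1*x1 + ... + cn*xn.
    X has one column per independent variable; the returned vector
    is [c0, c1, ..., cn], with c0 the intercept."""
    A = np.column_stack([np.ones(len(X)), X])   # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(X, coef):
    """Evaluate the fitted regression at new inputs."""
    return np.column_stack([np.ones(len(X)), X]) @ coef
```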

#### 2.4. Modeling Methodology

Daily pan evaporation (E_{Pan}) was estimated based on different input climatic variables (T_{max}, T_{min}, RH-1, RH-2, WS, and SSH). The five techniques used for estimation were the artificial neural network (ANN), wavelet-based artificial neural network (WANN), radial function-based support vector machine (SVM-RF), linear function-based support vector machine (SVM-LF), and multi-linear regression (MLR) models. The climatic parameters were collected from 2013 to 2017 and split into three different scenarios, based on the percentage of training and testing data used for model development (Table 3).
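A chronological split along the lines of Table 3 can be sketched as follows (the 610-sample length comes from Section 2.1; the function name is illustrative):

```python
import numpy as np

def split_scenario(data, train_frac):
    """Chronological train/test split used for the three scenarios
    (train_frac = 0.6, 0.7, 0.8). The earliest records form the
    training set, the remainder the testing set."""
    n_train = int(round(len(data) * train_frac))
    return data[:n_train], data[n_train:]

series = np.arange(610)          # placeholder for the 610-day E_pan record
for frac in (0.6, 0.7, 0.8):     # Scenarios 1, 2, 3
    train, test = split_scenario(series, frac)
    print(frac, len(train), len(test))
```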

#### 2.5. Performance Evaluation Criteria
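The four evaluation criteria (PCC, RMSE, NSE, WI) follow widely used definitions; a minimal sketch, assuming the standard formulas:

```python
import numpy as np

def evaluate(obs, pred):
    """Performance criteria used in this study, with the standard
    definitions assumed: Pearson's correlation coefficient (PCC),
    root mean square error (RMSE), Nash-Sutcliffe efficiency (NSE),
    and the Willmott index of agreement (WI)."""
    o = np.asarray(obs, dtype=float)
    p = np.asarray(pred, dtype=float)
    om = o.mean()
    pcc = np.corrcoef(o, p)[0, 1]
    rmse = np.sqrt(np.mean((o - p) ** 2))
    nse = 1.0 - np.sum((o - p) ** 2) / np.sum((o - om) ** 2)
    wi = 1.0 - np.sum((o - p) ** 2) / np.sum((np.abs(p - om) + np.abs(o - om)) ** 2)
    return {"PCC": pcc, "RMSE": rmse, "NSE": nse, "WI": wi}
```

A perfect model gives PCC = NSE = WI = 1 and RMSE = 0; NSE below 0 (as in several testing rows of Tables 5 and 6) means the model is outperformed by the observed mean.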

## 3. Results

#### 3.1. Quantitative and Qualitative Evaluation of Results

#### 3.2. Comparison of Training and Testing Datasets for Scenario 1

A comparison of the training results (Table 4) shows that SVM-RF estimated E_{pan} most efficiently of all the machine learning algorithms developed for training. During testing, the highest coefficient of determination (R^2) was obtained for the SVM-RF model. Thus, it can be suggested that SVM-RF modeled E_{pan} most efficiently among all the machine learning algorithms developed for testing.

#### 3.3. Comparison of Training and Testing Datasets for Scenario 2

For Scenario 2 (Table 5), SVM-RF estimated E_{pan} most efficiently among all the machine-learning algorithms developed for training. During testing, the highest coefficient of determination (R^2 = 0.3221) was obtained for the SVM-RF model. Thus, SVM-RF modeled E_{pan} most efficiently among all the machine learning algorithms developed for testing.

#### 3.4. Comparison of Training and Testing Datasets for Scenario 3

For Scenario 3 (Table 6), SVM-RF estimated E_{pan} most efficiently among all the machine learning algorithms developed for training. During testing, the highest coefficient of determination (R^2 = 0.2791) was obtained for the SVM-RF model. Thus, it can be seen that SVM-RF modeled the daily E_{pan} most efficiently among all the machine learning algorithms developed for testing.

These results indicate that, with the SVM models, daily E_{pan} can be modeled more accurately than with the ANN and WANN.

The Taylor diagrams (Figure 9) confirm this: the SVM-RF predictions of E_{pan} tend closest to the observed point on the abscissa, and the model's correlation coefficient, standard deviation, and root mean square difference are superior to those of the other models. Therefore, the SVM-RF model with the T_{max}, T_{min}, RH-1, RH-2, WS, and SSH climatic variables can be used for daily E_{pan} estimation at the Pusa station.
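A Taylor diagram summarizes three related statistics for each model; the sketch below computes them and relies on the identity from Taylor [50] that links the centered RMS difference to the two standard deviations and the correlation:

```python
import numpy as np

def taylor_stats(obs, pred):
    """Statistics plotted on a Taylor diagram: observed and predicted
    standard deviations s_o and s_p, correlation r, and centered RMS
    difference E'. They satisfy E'^2 = s_o^2 + s_p^2 - 2*s_o*s_p*r,
    which is why a single 2-D diagram can display all of them."""
    o = np.asarray(obs, dtype=float) - np.mean(obs)
    p = np.asarray(pred, dtype=float) - np.mean(pred)
    s_o, s_p = o.std(), p.std()
    r = np.corrcoef(o, p)[0, 1]
    crmsd = np.sqrt(np.mean((p - o) ** 2))   # centered RMS difference
    return s_o, s_p, r, crmsd
```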

## 4. Discussion

A previously reported model achieved an R^2 of 0.717 and an RMSE of 1.11 mm on an independent evaluation data set, which correlates with our outcomes. As reported by Keskin and Terzi [25], the R^2 values of the ANN(3,6,1), ANN(6,2,1), and ANN(7,2,1) models, equal to 0.770, 0.787, and 0.788 for modeling E_{pan}, are also acceptable and agree with our results. The developed models produced a more acceptable outcome than those of Kim et al. [53], who stated that the ANN and MLR generated R^2 values ranging from 0.69 to 0.74 and from 0.61 to 0.64, with RMSE varying from 1.38 to 1.48 and from 1.56 to 1.60, respectively. However, none of the models developed in this manuscript could capture the variability of the extreme values present in the input and output parameters at the study location. The models' efficiency might improve if the extreme values were removed; this is one of the limitations of the study outlined in this paper.

## 5. Conclusions

In this study, the performance of models for daily pan evaporation (E_{pan}) estimation was evaluated using the ANN, WANN, SVM-RF, SVM-LF, and MLR models. The input climatic variables for the estimation of daily E_{pan} were the maximum and minimum temperatures (T_{max} and T_{min}), relative humidity (RH-1 and RH-2), wind speed (WS), and bright sunshine hours (SSH). The free availability of these meteorological parameters for other stations in Bihar, India, is a significant concern and limitation of this research. The proposed models were trained and tested in three separate scenarios, i.e., Scenario 1, Scenario 2, and Scenario 3, utilizing different percentages of data points. The models were evaluated using statistical tools, namely, PCC, RMSE, NSE, and WI, and through visual inspection using line diagrams, scatter plots, and Taylor diagrams. The results evidenced the SVM-RF model's ability to estimate daily E_{pan}, integrating all weather variables, i.e., T_{max}, T_{min}, RH-1, RH-2, WS, and SSH. The SVM-RF model's dominance was found at the Pusa station for all scenarios investigated. It is also clear that, with an increase in the sample set for training, the testing data show a less accurate modeled result. Since the Pusa dataset has many extreme values, which the developed models could not capture very efficiently, this is one of the limitations of this paper. Overall, the current research showed the SVM-RF model's viability as a newly established data-intelligent method to simulate pan evaporation in the Indian region, and it can be extended to many water resources engineering applications. It is also recommended that SVM-RF models be applied under similar climatic conditions where the same meteorological parameters are available.

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## References

- Alizadeh, M.J.; Kavianpour, M.R.; Kisi, O.; Nourani, V. A new approach for simulating and forecasting the rainfall-runoff process within the next two months. J. Hydrol. **2017**, 548, 588–597.
- Adnan, R.M.; Liang, Z.; Parmar, K.S.; Soni, K.; Kisi, O. Modeling monthly streamflow in mountainous basin by MARS, GMDH-NN and DENFIS using hydroclimatic data. Neural Comput. Appl. **2021**, 33, 2853–2871.
- Mbangiwa, N.C.; Savage, M.J.; Mabhaudhi, T. Modelling and measurement of water productivity and total evaporation in a dryland soybean crop. Agric. For. Meteorol. **2019**, 266–267, 65–72.
- Sayl, K.N.; Muhammad, N.S.; Yaseen, Z.M.; El-shafie, A. Estimation the Physical Variables of Rainwater Harvesting System Using Integrated GIS-Based Remote Sensing Approach. Water Resour. Manag. **2016**, 30, 3299–3313.
- Sanikhani, H.; Kisi, O.; Maroufpoor, E.; Yaseen, Z.M. Temperature-based modeling of reference evapotranspiration using several artificial intelligence models: Application of different modeling scenarios. Theor. Appl. Climatol. **2019**, 135, 449–462.
- Rajaee, T.; Nourani, V.; Zounemat-Kermani, M.; Kisi, O. River suspended sediment load prediction: Application of ANN and wavelet conjunction model. J. Hydrol. Eng. **2011**, 16, 613–627.
- Aytek, A. Co-active neurofuzzy inference system for evapotranspiration modeling. Soft Comput. **2008**, 13, 691.
- Wang, K.; Liu, X.; Tian, W.; Li, Y.; Liang, K.; Liu, C.; Li, Y.; Yang, X. Pan coefficient sensitivity to environment variables across China. J. Hydrol. **2019**, 572, 582–591.
- Adnan, R.M.; Liang, Z.; Heddam, S.; Zounemat-Kermani, M.; Kisi, O.; Li, B. Least square support vector machine and multivariate adaptive regression splines for streamflow prediction in mountainous basin using hydro-meteorological data as inputs. J. Hydrol. **2020**, 586, 124371.
- Snyder, R.L. Equation for Evaporation Pan to Evapotranspiration Conversions. J. Irrig. Drain. Eng. **1992**, 118, 977–980.
- Adnan, R.M.; Liang, Z.; Trajkovic, S.; Zounemat-Kermani, M.; Li, B.; Kisi, O. Daily streamflow prediction using optimally pruned extreme learning machine. J. Hydrol. **2019**, 577, 123981.
- Yuan, X.; Chen, C.; Lei, X.; Yuan, Y.; Muhammad Adnan, R. Monthly runoff forecasting based on LSTM–ALO model. Stoch. Environ. Res. Risk Assess. **2018**, 32, 2199–2212.
- Zerouali, B.; Al-Ansari, N.; Chettih, M.; Mohamed, M.; Abda, Z.; Santos, C.A.G.; Zerouali, B.; Elbeltagi, A. An Enhanced Innovative Triangular Trend Analysis of Rainfall Based on a Spectral Approach. Water **2021**, 13, 727.
- Malik, A.; Rai, P.; Heddam, S.; Kisi, O.; Sharafati, A.; Salih, S.Q.; Al-Ansari, N.; Yaseen, Z.M. Pan Evaporation Estimation in Uttarakhand and Uttar Pradesh States, India: Validity of an Integrative Data Intelligence Model. Atmosphere **2020**, 11, 553.
- Cobaner, M. Evapotranspiration estimation by two different neuro-fuzzy inference systems. J. Hydrol. **2011**, 398, 292–302.
- Muhammad Adnan, R.; Chen, Z.; Yuan, X.; Kisi, O.; El-Shafie, A.; Kuriqi, A.; Ikram, M. Reference Evapotranspiration Modeling Using New Heuristic Methods. Entropy **2020**, 22, 547.
- Alizamir, M.; Kisi, O.; Muhammad Adnan, R.; Kuriqi, A. Modelling reference evapotranspiration by combining neuro-fuzzy and evolutionary strategies. Acta Geophys. **2020**, 68, 1113–1126.
- Vallet-Coulomb, C.; Legesse, D.; Gasse, F.; Travi, Y.; Chernet, T. Lake evaporation estimates in tropical Africa (Lake Ziway, Ethiopia). J. Hydrol. **2001**, 245, 1–18.
- Moghaddamnia, A.; Ghafari Gousheh, M.; Piri, J.; Amin, S.; Han, D. Evaporation estimation using artificial neural networks and adaptive neuro-fuzzy inference system techniques. Adv. Water Resour. **2009**, 32, 88–97.
- Guven, A.; Kişi, Ö. Daily pan evaporation modeling using linear genetic programming technique. Irrig. Sci. **2011**, 29, 135–145.
- Rahimi Khoob, A. Artificial neural network estimation of reference evapotranspiration from pan evaporation in a semi-arid environment. Irrig. Sci. **2008**, 27, 35–39.
- Trajkovic, S. Testing hourly reference evapotranspiration approaches using lysimeter measurements in a semiarid climate. Hydrol. Res. **2009**, 41, 38–49.
- Sudheer, K.P.; Gosain, A.K.; Mohana Rangan, D.; Saheb, S.M. Modelling evaporation using an artificial neural network algorithm. Hydrol. Process. **2002**, 16, 3189–3202.
- Keskin, M.E.; Terzi, Ö.; Taylan, D. Fuzzy logic model approaches to daily pan evaporation estimation in western Turkey/Estimation de l’évaporation journalière du bac dans l’Ouest de la Turquie par des modèles à base de logique floue. Hydrol. Sci. J. **2004**, 49, 1010.
- Keskin, M.E.; Terzi, Ö. Artificial Neural Network Models of Daily Pan Evaporation. J. Hydrol. Eng. **2006**, 11, 65–70.
- Tan, S.B.K.; Shuy, E.B.; Chua, L.H.C. Modelling hourly and daily open-water evaporation rates in areas with an equatorial climate. Hydrol. Process. **2007**, 21, 486–499.
- Kisi, Ö.; Çobaner, M. Modeling River Stage-Discharge Relationships Using Different Neural Network Computing Techniques. CLEAN Soil Air Water **2009**, 37, 160–169.
- Piri, J.; Amin, S.; Moghaddamnia, A.; Keshavarz, A.; Han, D.; Remesan, R. Daily Pan Evaporation Modeling in a Hot and Dry Climate. J. Hydrol. Eng. **2009**, 14, 803–811.
- Keskin, M.E.; Terzi, Ö.; Taylan, D. Estimating daily pan evaporation using adaptive neural-based fuzzy inference system. Theor. Appl. Climatol. **2009**, 98, 79–87.
- Dogan, E.; Gumrukcuoglu, M.; Sandalci, M.; Opan, M. Modelling of evaporation from the reservoir of Yuvacik dam using adaptive neuro-fuzzy inference systems. Eng. Appl. Artif. Intell. **2010**, 23, 961–967.
- Tabari, H.; Marofi, S.; Sabziparvar, A.-A. Estimation of daily pan evaporation using artificial neural network and multivariate non-linear regression. Irrig. Sci. **2010**, 28, 399–406.
- Chu, H.-J.; Chang, L.-C. Application of Optimal Control and Fuzzy Theory for Dynamic Groundwater Remediation Design. Water Resour. Manag. **2009**, 23, 647–660.
- Vapnik, V. The Nature of Statistical Learning Theory; Springer: New York, NY, USA, 1995.
- Kim, S.; Shiri, J.; Kisi, O. Pan Evaporation Modeling Using Neural Computing Approach for Different Climatic Zones. Water Resour. Manag. **2012**, 26, 3231–3249.
- Tikhamarine, Y.; Malik, A.; Pandey, K.; Sammen, S.S.; Souag-Gamane, D.; Heddam, S.; Kisi, O. Monthly evapotranspiration estimation using optimal climatic parameters: Efficacy of hybrid support vector regression integrated with whale optimization algorithm. Environ. Monit. Assess. **2020**, 192, 696.
- Elbeltagi, A.; Deng, J.; Wang, K.; Hong, Y. Crop Water footprint estimation and modeling using an artificial neural network approach in the Nile Delta, Egypt. Agric. Water Manag. **2020**, 235, 106080.
- Elbeltagi, A.; Aslam, M.R.; Malik, A.; Mehdinejadiani, B.; Srivastava, A.; Bhatia, A.S.; Deng, J. The impact of climate changes on the water footprint of wheat and maize production in the Nile Delta, Egypt. Sci. Total Environ. **2020**, 743, 140770.
- Elbeltagi, A.; Deng, J.; Wang, K.; Malik, A.; Maroufpoor, S. Modeling long-term dynamics of crop evapotranspiration using deep learning in a semi-arid environment. Agric. Water Manag. **2020**, 241, 106334.
- Elbeltagi, A.; Aslam, M.R.; Mokhtar, A.; Deb, P.; Abubakar, G.A.; Kushwaha, N.L.; Venancio, L.P.; Malik, A.; Kumar, N.; Deng, J. Spatial and temporal variability analysis of green and blue evapotranspiration of wheat in the Egyptian Nile Delta from 1997 to 2017. J. Hydrol. **2020**, 125662.
- Kim, T.-W.; Valdés, J.B. Nonlinear Model for Drought Forecasting Based on a Conjunction of Wavelet Transforms and Neural Networks. J. Hydrol. Eng. **2003**, 8, 319–328.
- Adamowski, J.; Fung Chan, H.; Prasher, S.O.; Ozga-Zielinski, B.; Sliusarieva, A. Comparison of multiple linear and nonlinear regression, autoregressive integrated moving average, artificial neural network, and wavelet artificial neural network methods for urban water demand forecasting in Montreal, Canada. Water Resour. Res. **2012**, 48.
- Labat, D.; Ababou, R.; Mangin, A. Rainfall–runoff relations for karstic springs. Part II: Continuous wavelet and discrete orthogonal multiresolution analyses. J. Hydrol. **2000**, 238, 149–178.
- Kişi, Ö. Daily suspended sediment estimation using neuro-wavelet models. Int. J. Earth Sci. **2010**, 99, 1471–1482.
- Rajaee, T. Wavelet and ANN combination model for prediction of daily suspended sediment load in rivers. Sci. Total Environ. **2011**, 409, 2917–2928.
- Adnan, R.M.; Khosravinia, P.; Karimi, B.; Kisi, O. Prediction of hydraulics performance in drain envelopes using Kmeans based multivariate adaptive regression spline. Appl. Soft Comput. **2021**, 100, 107008.
- Lin, J.-Y.; Cheng, C.-T.; Chau, K.-W. Using support vector machines for long-term discharge prediction. Hydrol. Sci. J. **2006**, 51, 599–612.
- Tripathi, S.; Srinivas, V.V.; Nanjundiah, R.S. Downscaling of precipitation for climate change scenarios: A support vector machine approach. J. Hydrol. **2006**, 330, 621–640.
- Liu, Q.-J.; Shi, Z.-H.; Fang, N.-F.; Zhu, H.-D.; Ai, L. Modeling the daily suspended sediment concentration in a hyperconcentrated river on the Loess Plateau, China, using the Wavelet–ANN approach. Geomorphology **2013**, 186, 181–190.
- Cherkassky, V.; Ma, Y. Practical selection of SVM parameters and noise estimation for SVM regression. Neural Netw. **2004**, 17, 113–126.
- Taylor, K.E. Summarizing multiple aspects of model performance in a single diagram. J. Geophys. Res. Atmos. **2001**, 106, 7183–7192.
- Tezel, G.; Buyukyildiz, M. Monthly evaporation forecasting using artificial neural networks and support vector machines. Theor. Appl. Climatol. **2016**, 124, 69–80.
- Pammar, L.; Deka, P.C. Daily pan evaporation modeling in climatically contrasting zones with hybridization of wavelet transform and support vector machines. Paddy Water Environ. **2017**, 15, 711–722.
- Kim, S.; Singh, V.P.; Seo, Y. Evaluation of pan evaporation modeling with two different neural networks and weather station data. Theor. Appl. Climatol. **2014**, 117, 1–13.

**Figure 6.** Line and scatter plots between observed and predicted data in Scenario 1 for (**a**) ANN, (**b**) WANN, (**c**) SVM-RF, (**d**) SVM-LF, and (**e**) MLR for the study area.

**Figure 7.** Line and scatter plots between observed and predicted data in Scenario 2 for (**a**) ANN, (**b**) WANN, (**c**) SVM-RF, (**d**) SVM-LF, and (**e**) MLR for the study area.

**Figure 8.** Line and scatter plots between observed and predicted data in Scenario 3 for (**a**) ANN, (**b**) WANN, (**c**) SVM-RF, (**d**) SVM-LF, and (**e**) MLR for the study area.

**Figure 9.** Taylor diagrams of ANN, WANN, SVM-RF, SVM-LF, and MLR corresponding to (**a**) Scenario 1, (**b**) Scenario 2, and (**c**) Scenario 3 during the testing period at the study site.

**Table 1.** Statistical analysis of the daily climatic variables and pan evaporation (2013–2017).

Statistical Parameters | Mean | Median | Minimum | Maximum | Std. Dev. | Kurtosis | Skewness |
---|---|---|---|---|---|---|---|
T_{max} (°C) | 33.58 | 33.80 | 23.40 | 42.70 | 2.43 | 1.30 | −0.11 |
T_{min} (°C) | 25.87 | 26.00 | 21.40 | 29.60 | 1.31 | 0.37 | −0.51 |
RH-1 (%) | 88.42 | 89.00 | 55.00 | 98.00 | 5.39 | 4.27 | −1.33 |
RH-2 (%) | 68.83 | 68.00 | 23.00 | 97.00 | 12.17 | 0.65 | −0.22 |
WS (km/h) | 6.03 | 5.70 | 1.20 | 16.70 | 2.63 | 0.82 | 0.85 |
SSH (h) | 5.36 | 5.55 | 0.00 | 12.70 | 3.50 | −1.20 | −0.02 |
E_{Pan} (mm) | 3.85 | 3.70 | 0.00 | 13.00 | 1.67 | 2.34 | 0.89 |

**Table 2.** Cross-correlation of the daily climatic variables with E_{Pan}.

Climatic Variable | T_{max} | T_{min} | RH-1 | RH-2 | WS | SSH | E_{Pan} |
---|---|---|---|---|---|---|---|
T_{max} | 1.00 | | | | | | |
T_{min} | 0.32 | 1.00 | | | | | |
RH-1 | −0.43 | −0.29 | 1.00 | | | | |
RH-2 | −0.51 | −0.15 | 0.48 | 1.00 | | | |
WS | −0.07 | 0.02 | −0.19 | 0.00 | 1.00 | | |
SSH | 0.68 | 0.28 | −0.42 | −0.51 | 0.05 | 1.00 | |
E_{Pan} | 0.58 | 0.11 | −0.30 | −0.34 | 0.19 | 0.51 | 1.00 |

**Table 3.** Division of the dataset into training and testing data for the three scenarios.

Scenarios | Training Data Length (%) | Testing Data Length (%) |
---|---|---|
Scenario 1 | 60% (2013–2015) | 40% (2016–2017) |
Scenario 2 | 70% | 30% |
Scenario 3 | 80% (2013–2016) | 20% (2017) |

**Table 4.** Results for ANN, WANN, SVM-RF, SVM-LF, and MLR during the training and testing period for Scenario 1 (60–40: Training–Testing).

Model | Structure | Dataset | PCC | RMSE | NSE | WI |
---|---|---|---|---|---|---|
ANN-1 | 6-5-1 | Training | 0.832 | 0.993 | 0.685 | 0.904 |
 | | Testing | 0.589 | 1.387 | 0.136 | 0.708 |
ANN-2 | 6-8-1 | Training | 0.739 | 1.254 | 0.498 | 0.840 |
 | | Testing | 0.585 | 1.486 | 0.010 | 0.732 |
ANN-3 | 6-12-1 | Training | 0.769 | 1.157 | 0.573 | 0.846 |
 | | Testing | 0.531 | 1.529 | −0.048 | 0.705 |
WANN-1 | 24-6-1 | Training | 0.773 | 1.123 | 0.597 | 0.860 |
 | | Testing | 0.505 | 1.394 | 0.129 | 0.676 |
WANN-2 | 24-11-1 | Training | 0.694 | 1.286 | 0.472 | 0.813 |
 | | Testing | 0.428 | 1.491 | 0.003 | 0.614 |
WANN-3 | 24-16-1 | Training | 0.634 | 1.502 | 0.281 | 0.766 |
 | | Testing | 0.477 | 1.643 | −0.211 | 0.681 |
SVM-RF-1 | c = 1, ε = 0.001, γ = 0.16 | Training | 0.777 | 1.122 | 0.599 | 0.856 |
 | | Testing | 0.595 | 1.369 | 0.159 | 0.746 |
SVM-RF-2 | c = 1, ε = 0.01, γ = 0.16 | Training | 0.794 | 1.088 | 0.622 | 0.864 |
 | | Testing | 0.604 | 1.344 | 0.190 | 0.749 |
SVM-RF-3 | c = 1, ε = 0.1, γ = 0.16 | Training | 0.857 | 0.956 | 0.708 | 0.895 |
 | | Testing | 0.607 | 1.349 | 0.183 | 0.749 |
SVM-LF-1 | c = 1, ε = 0.1, γ = 0.5 | Training | 0.687 | 1.297 | 0.463 | 0.804 |
 | | Testing | 0.592 | 1.406 | 0.113 | 0.731 |
SVM-LF-2 | c = 1, ε = 0.1, γ = 0.8 | Training | 0.687 | 1.297 | 0.463 | 0.804 |
 | | Testing | 0.592 | 1.406 | 0.113 | 0.731 |
SVM-LF-3 | c = 1, ε = 0.1, γ = 0.16 | Training | 0.687 | 1.297 | 0.463 | 0.807 |
 | | Testing | 0.592 | 1.406 | 0.113 | 0.731 |
MLR | | Training | 0.695 | 1.274 | 0.483 | 0.800 |
 | | Testing | 0.587 | 1.345 | 0.188 | 0.725 |

**Table 5.** Results for ANN, WANN, SVM-RF, SVM-LF, and MLR during the training and testing period for Scenario 2 (70–30: Training–Testing).

Model | Structure | Dataset | PCC | RMSE | NSE | WI |
---|---|---|---|---|---|---|
ANN-1 | 6-1-1 | Training | 0.760 | 1.180 | 0.577 | 0.854 |
 | | Testing | 0.547 | 1.222 | 0.046 | 0.704 |
ANN-2 | 6-4-1 | Training | 0.749 | 1.209 | 0.557 | 0.842 |
 | | Testing | 0.535 | 1.333 | −0.135 | 0.691 |
ANN-3 | 6-10-1 | Training | 0.716 | 1.278 | 0.504 | 0.824 |
 | | Testing | 0.546 | 1.235 | 0.026 | 0.727 |
WANN-1 | 24-1-1 | Training | 0.672 | 1.344 | 0.452 | 0.781 |
 | | Testing | 0.439 | 1.316 | −0.106 | 0.602 |
WANN-2 | 24-6-1 | Training | 0.725 | 1.264 | 0.515 | 0.831 |
 | | Testing | 0.457 | 1.252 | −0.002 | 0.639 |
WANN-3 | 24-9-1 | Training | 0.716 | 1.281 | 0.502 | 0.802 |
 | | Testing | 0.413 | 1.275 | −0.039 | 0.604 |
SVM-RF-1 | c = 1, ε = 0.001, γ = 0.16 | Training | 0.764 | 1.178 | 0.579 | 0.847 |
 | | Testing | 0.560 | 1.285 | −0.055 | 0.704 |
SVM-RF-2 | c = 1, ε = 0.01, γ = 0.16 | Training | 0.765 | 1.177 | 0.579 | 0.848 |
 | | Testing | 0.561 | 1.286 | −0.056 | 0.705 |
SVM-RF-3 | c = 1, ε = 0.1, γ = 0.16 | Training | 0.812 | 1.073 | 0.650 | 0.875 |
 | | Testing | 0.568 | 1.262 | −0.018 | 0.714 |
SVM-LF-1 | c = 1, ε = 0.1, γ = 0.9 | Training | 0.689 | 1.326 | 0.466 | 0.805 |
 | | Testing | 0.539 | 1.356 | −0.175 | 0.696 |
SVM-LF-2 | c = 1, ε = 0.01, γ = 0.16 | Training | 0.688 | 1.330 | 0.463 | 0.807 |
 | | Testing | 0.542 | 1.360 | −0.182 | 0.700 |
SVM-LF-3 | c = 1, ε = 0.1, γ = 0.16 | Training | 0.689 | 1.326 | 0.466 | 0.805 |
 | | Testing | 0.539 | 1.356 | −0.175 | 0.696 |
MLR | | Training | 0.693 | 1.308 | 0.481 | 0.799 |
 | | Testing | 0.531 | 1.262 | −0.017 | 0.700 |

**Table 6.** Results for ANN, WANN, SVM-RF, SVM-LF, and MLR during the training and testing period for Scenario 3 (80–20: Training–Testing).

Model | Structure | Dataset | PCC | RMSE | NSE | WI |
---|---|---|---|---|---|---|
ANN-1 | 6-1-1 | Training | 0.701 | 1.250 | 0.490 | 0.809 |
 | | Testing | 0.512 | 1.321 | −0.152 | 0.681 |
ANN-2 | 6-9-1 | Training | 0.764 | 1.136 | 0.578 | 0.847 |
 | | Testing | 0.514 | 1.260 | −0.049 | 0.695 |
ANN-3 | 6-13-1 | Training | 0.789 | 1.079 | 0.620 | 0.879 |
 | | Testing | 0.520 | 1.333 | −0.172 | 0.688 |
WANN-1 | 24-2-1 | Training | 0.725 | 1.213 | 0.519 | 0.812 |
 | | Testing | 0.467 | 1.447 | −0.382 | 0.608 |
WANN-2 | 24-7-1 | Training | 0.693 | 1.267 | 0.476 | 0.813 |
 | | Testing | 0.369 | 1.434 | −0.357 | 0.586 |
WANN-3 | 24-11-1 | Training | 0.721 | 1.221 | 0.513 | 0.812 |
 | | Testing | 0.439 | 1.334 | −0.175 | 0.603 |
SVM-RF-1 | c = 1, ε = 0.001, γ = 0.16 | Training | 0.768 | 1.128 | 0.584 | 0.849 |
 | | Testing | 0.527 | 1.415 | −0.322 | 0.660 |
SVM-RF-2 | c = 1, ε = 0.1, γ = 0.2 | Training | 0.850 | 0.951 | 0.705 | 0.894 |
 | | Testing | 0.526 | 1.413 | −0.318 | 0.664 |
SVM-RF-3 | c = 1, ε = 0.1, γ = 0.16 | Training | 0.893 | 0.858 | 0.760 | 0.913 |
 | | Testing | 0.528 | 1.411 | −0.315 | 0.665 |
SVM-LF-1 | c = 1, ε = 0.1, γ = 0.3 | Training | 0.684 | 1.286 | 0.460 | 0.802 |
 | | Testing | 0.496 | 1.453 | −0.394 | 0.658 |
SVM-LF-2 | c = 1, ε = 0.1, γ = 0.6 | Training | 0.684 | 1.286 | 0.460 | 0.802 |
 | | Testing | 0.496 | 1.453 | −0.394 | 0.658 |
SVM-LF-3 | c = 1, ε = 0.001, γ = 0.16 | Training | 0.683 | 1.286 | 0.460 | 0.803 |
 | | Testing | 0.490 | 1.465 | −0.417 | 0.654 |
MLR | | Training | 0.688 | 1.269 | 0.474 | 0.795 |
 | | Testing | 0.506 | 1.363 | −0.227 | 0.665 |

**Table 7.** Results for the best ANN, WANN, SVM-RF, and MLR models during the training and testing period for all scenarios.

Scenario | Model | Dataset | PCC | RMSE | NSE | WI |
---|---|---|---|---|---|---|
1 | ANN-1 | Training | 0.832 | 0.993 | 0.685 | 0.904 |
 | | Testing | 0.589 | 1.387 | 0.136 | 0.708 |
 | WANN-1 | Training | 0.773 | 1.123 | 0.597 | 0.860 |
 | | Testing | 0.505 | 1.394 | 0.129 | 0.676 |
 | SVM-RF-3 | Training | 0.857 | 0.956 | 0.708 | 0.895 |
 | | Testing | 0.607 | 1.349 | 0.183 | 0.749 |
 | MLR | Training | 0.695 | 1.274 | 0.483 | 0.800 |
 | | Testing | 0.587 | 1.345 | 0.188 | 0.725 |
2 | ANN-1 | Training | 0.760 | 1.180 | 0.577 | 0.854 |
 | | Testing | 0.547 | 1.222 | 0.046 | 0.704 |
 | WANN-2 | Training | 0.725 | 1.264 | 0.515 | 0.831 |
 | | Testing | 0.457 | 1.252 | −0.002 | 0.639 |
 | SVM-RF-3 | Training | 0.812 | 1.073 | 0.650 | 0.875 |
 | | Testing | 0.568 | 1.262 | −0.018 | 0.714 |
 | MLR | Training | 0.693 | 1.308 | 0.481 | 0.799 |
 | | Testing | 0.531 | 1.262 | −0.017 | 0.700 |
3 | ANN-3 | Training | 0.789 | 1.079 | 0.620 | 0.879 |
 | | Testing | 0.520 | 1.333 | −0.172 | 0.688 |
 | WANN-1 | Training | 0.725 | 1.213 | 0.519 | 0.812 |
 | | Testing | 0.467 | 1.447 | −0.382 | 0.608 |
 | SVM-RF-3 | Training | 0.893 | 0.858 | 0.760 | 0.913 |
 | | Testing | 0.528 | 1.411 | −0.315 | 0.665 |
 | MLR | Training | 0.688 | 1.269 | 0.474 | 0.795 |
 | | Testing | 0.506 | 1.363 | −0.227 | 0.665 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Kumar, M.; Kumari, A.; Kumar, D.; Al-Ansari, N.; Ali, R.; Kumar, R.; Kumar, A.; Elbeltagi, A.; Kuriqi, A.
The Superiority of Data-Driven Techniques for Estimation of Daily Pan Evaporation. *Atmosphere* **2021**, *12*, 701.
https://doi.org/10.3390/atmos12060701
