# Evaluation of Machine Learning Models for Daily Reference Evapotranspiration Modeling Using Limited Meteorological Data in Eastern Inner Mongolia, North China


## Abstract

**Background:** Reference evapotranspiration (ET_{0}) is of the utmost importance for computing crop water requirements, agricultural water management, and irrigation scheduling design. However, due to the combination of insufficient meteorological data and uncertain inputs, the accuracy and stability of ET_{0} prediction models vary to different degrees. **Methods:** Six machine learning models were proposed in the current study for daily ET_{0} estimation. Weather data, including the maximum and minimum air temperatures, solar radiation, relative humidity, and wind speed, during the period 1960–2019 were obtained from eighteen stations in the northeast of Inner Mongolia, China. Three input combinations were utilized to train and test the proposed models, which were compared with the corresponding empirical equations: two temperature-based, three radiation-based, and two humidity-based equations. Two strategies were used to evaluate the ET_{0} estimation models: (1) at each weather station, the proposed machine learning models were trained and tested and then compared with the empirical equations; and (2) all weather stations were sorted into three groups based on their average climatic features using the K-means algorithm, and each station then tested the machine learning models trained on the other stations within its group. Three statistical indicators, namely, the determination coefficient (R^{2}), root mean square error (RMSE), and mean absolute error (MAE), were used to evaluate the performance of the models. **Results:** (1) The temperature-based temporal convolutional neural network (TCN) model outperformed the empirical equations in the first strategy: the TCN’s R^{2} values were 0.091, 0.050, and 0.061 higher than those of the empirical equations; its RMSE was significantly lower, by 0.224, 0.135, and 0.159 mm/d; and its MAE was significantly lower, by 0.208, 0.151, and 0.097 mm/d. In the second strategy, compared with the temperature-based empirical equations, the TCN model markedly reduced RMSE and MAE while increasing R^{2}. (2) In comparison with the radiation-based empirical equations, all machine learning models, particularly the TCN model, reduced RMSE and MAE while significantly increasing R^{2} in both strategies. (3) Likewise, in both strategies, all machine learning models, particularly the TCN model, significantly enhanced R^{2} and reduced RMSE and MAE when compared with the humidity-based empirical equations. **Conclusions:** When radiation or humidity features were added to the given temperature features, all the proposed machine learning models could estimate ET_{0} with an accuracy higher than that of the calibrated empirical equations outside the training study area, which makes it possible to develop an ET_{0} estimation model from cross-station data with similar meteorological characteristics and obtain a satisfactory ET_{0} estimate for the target station.

## 1. Introduction

The computation of crop water requirements and irrigation scheduling relies on reference evapotranspiration (ET_{0}) [1,2,3]. As a result of its high accuracy across a range of climatic circumstances, the United Nations Food and Agriculture Organization (FAO) has recommended the Penman–Monteith equation (FAO56-PM) as a standard approach for predicting ET_{0} and calibrating other empirical and semiempirical models [4,5,6]. The data requirements of the FAO56-PM equation, however, are high, since the model requires solar radiation, the maximum and minimum air temperature, relative humidity, and wind speed. Meteorological stations measuring all of these features are scarce worldwide, and the wide application of the Penman–Monteith method has thus been severely constrained, especially in developing countries such as China [7]. Therefore, in order to estimate ET_{0} accurately, simpler models using fewer weather feature inputs need to be explored.

Numerous empirical models have been developed to estimate ET_{0}, such as temperature-based models [8,9], radiation-based models [10,11,12], humidity-based models [13,14], mass transfer-based models [15], and pan-based methods [16]. Among these, when not all meteorological features are available, the first three are frequently used in place of the FAO56-PM equation to calculate ET_{0} [10]. Nevertheless, these empirical models also have some defects: they may overestimate or underestimate ET_{0} [14], and they are suitable for weekly or monthly scales but not effective for daily ET_{0} estimation [17]. Thus, it is necessary to investigate and develop better models for forecasting ET_{0} with a high level of precision using fewer weather features.

Machine learning models have increasingly been applied to estimate ET_{0} in order to overcome the dependency on meteorological data, owing to their outstanding capacity to handle nonlinear interactions between the dependent and independent variables. Many machine learning and deep learning models have been proposed to forecast ET_{0}, including support vector machines (SVMs) [18,19,20], random forests (RFs) [5,21], the M5 model tree (M5Tree) [22,23], extreme gradient boosting (XGBoost) [24], artificial neural networks (ANNs) [4,25], extreme learning machines (ELMs) [7,26], the long short-term memory neural network (LSTM) [16,27], bidirectional LSTM (Bi-LSTM) [28], the adaptive neuro fuzzy inference system (ANFIS) [29], and multivariate adaptive regression splines (MARS) [30,31].

These studies have demonstrated the applicability of machine learning models for ET_{0} prediction, but deep learning methods, particularly the TCN and LSTM models, have been employed quite sparsely. In addition, there has not been a thorough comparison of these deep learning models with the widely used ANN, SVM, and RF models, particularly in terms of their performance in predicting ET_{0} with limited meteorological inputs under different climatic conditions. Additionally, almost all studies that have adopted classical machine learning models (e.g., ANN, SVM, and RF) to estimate ET_{0} have tested the accuracy and stability of these models at a single station alone, which leaves their performance across stations unexamined. Given this, the specific objectives of this study were to (1) develop six machine learning models, namely, TCN, ANN, LSTM, K-nearest neighbors (KNN), RF, and Light Gradient-Boosting Machine (LGB), to predict ET_{0} in North China’s Inner Mongolia Autonomous Region; (2) determine the effects of limited meteorological inputs on the accuracy of daily ET_{0} prediction; and (3) investigate the performance of these machine learning models and empirical equations within and outside of the study area.

## 2. Materials and Methods

With features of both a continental and a monsoon climate, the northeast of the Inner Mongolia Autonomous Region lies in the mild temperate zone. The average daily weather variables for the 18 stations located in the northeast of China’s Inner Mongolia Autonomous Region are summarized in Table 1. Continuous daily meteorological data during the period 1960–2019 were collected from the China Meteorological Administration, including the maximum temperature (T_{max}), minimum temperature (T_{min}), sunshine duration (SH), relative humidity (RH), wind speed at 2 m height (U_{2}), and precipitation (P).

#### 2.1. Models for Modeling Reference Evapotranspiration

#### 2.1.1. FAO-56 Penman–Monteith Equation

The FAO56-PM equation was used in this study to compute the daily ET_{0} data, which were chosen as the calibration and assessment targets for the proposed models. This process is reasonable and has been applied in many earlier investigations [4,32,33,34]. The Penman–Monteith model is expressed as follows:

$$\mathrm{ET}_{0} = \frac{0.408\,\Delta\,({\mathrm{R}}_{\mathrm{n}} - \mathrm{G}) + \gamma\,\frac{900}{\mathrm{T} + 273}\,{\mathrm{u}}_{2}\,({\mathrm{e}}_{\mathrm{s}} - {\mathrm{e}}_{\mathrm{a}})}{\Delta + \gamma\,(1 + 0.34\,{\mathrm{u}}_{2})}$$

where ET_{0} is the reference evapotranspiration (mm d^{−1}); ${\mathrm{R}}_{\mathrm{n}}$ is the net radiation (MJ m^{−2} d^{−1}); G is the soil heat flux density (MJ m^{−2} d^{−1}); T is the mean daily air temperature (°C); u_{2} is the wind speed at a height of 2 m (m s^{−1}); e_{s} and e_{a} are the saturation vapor pressure (kPa) and actual vapor pressure (kPa), respectively; $\Delta$ is the slope of the vapor pressure curve (kPa °C^{−1}); and $\gamma$ is the psychrometric constant (kPa °C^{−1}).
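As a concrete illustration, the Penman–Monteith expression above can be sketched as a short function; the function name and argument layout are illustrative, $\Delta$ is derived from the mean temperature via the standard FAO-56 slope formula, and R_{n}, G, e_{s}, e_{a}, and $\gamma$ are assumed to have been computed beforehand.

```python
import math

def fao56_pm_et0(t_mean, u2, es, ea, rn, g, gamma):
    """Daily reference evapotranspiration (mm/d) via the FAO56-PM equation.

    t_mean: mean daily air temperature (deg C)
    u2: wind speed at 2 m (m/s); es, ea: vapor pressures (kPa)
    rn: net radiation and g: soil heat flux density (MJ m-2 d-1)
    gamma: psychrometric constant (kPa per deg C)
    """
    # Slope of the saturation vapor pressure curve (kPa per deg C), FAO-56
    delta = (4098.0 * (0.6108 * math.exp(17.27 * t_mean / (t_mean + 237.3)))
             / (t_mean + 237.3) ** 2)
    num = (0.408 * delta * (rn - g)
           + gamma * (900.0 / (t_mean + 273.0)) * u2 * (es - ea))
    return num / (delta + gamma * (1.0 + 0.34 * u2))
```

For typical mid-latitude summer inputs, the function returns values of a few mm/d, consistent with the magnitudes reported in Section 3.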

#### 2.1.2. Empirical Models for Predicting Daily ET_{0}

The Hargreaves model estimates ET_{0} using just the maximum and minimum air temperature. Equations (2)–(4) represent the Hargreaves and modified Hargreaves models [8,35,36].

In these equations, R_{a} is the extraterrestrial radiation (MJ m^{−2} d^{−1}), which is calculated using the following equation:

where ${\mathrm{G}}_{\mathrm{sc}}$ is the solar constant (0.0820 MJ m^{−2} min^{−1}); ${\mathrm{d}}_{\mathrm{r}}$ is the inverse relative distance between the Earth and the sun; δ is the solar declination (rad); $\mathsf{\phi}$ is the latitude (rad); and ${\mathsf{\omega}}_{\mathrm{s}}$ is the sunset hour angle (rad).
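A sketch of how R_{a} and a Hargreaves-type estimate can be computed from latitude and day of year, using the standard FAO-56 formulas for d_{r}, δ, and ω_{s}; the function names are illustrative, and the coefficients of the modified Hargreaves variants (MH1, MH2) are not reproduced here.

```python
import math

def extraterrestrial_radiation(lat_deg, day_of_year):
    """Daily extraterrestrial radiation Ra (MJ m-2 d-1)."""
    gsc = 0.0820                                    # solar constant, MJ m-2 min-1
    phi = math.radians(lat_deg)                     # latitude (rad)
    dr = 1 + 0.033 * math.cos(2 * math.pi * day_of_year / 365)        # Earth-sun distance factor
    delta = 0.409 * math.sin(2 * math.pi * day_of_year / 365 - 1.39)  # solar declination (rad)
    ws = math.acos(-math.tan(phi) * math.tan(delta))                  # sunset hour angle (rad)
    return (24 * 60 / math.pi) * gsc * dr * (
        ws * math.sin(phi) * math.sin(delta)
        + math.cos(phi) * math.cos(delta) * math.sin(ws))

def hargreaves_et0(t_max, t_min, ra):
    """Classical Hargreaves-Samani ET0 (mm/d); 0.408 converts Ra to mm/d."""
    t_mean = (t_max + t_min) / 2
    return 0.0023 * 0.408 * ra * (t_mean + 17.8) * math.sqrt(t_max - t_min)
```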

Three radiation-based empirical models, namely, the Makkink, Ritchie, and Priestley–Taylor equations, were also applied to estimate ET_{0} [10,11,37].

In these equations, ${\mathrm{R}}_{\mathrm{s}}$ is the solar radiation (MJ m^{−2} d^{−1}), which is given by

Finally, the humidity-based ET_{0} was derived using the following formula:

#### 2.1.3. Machine Learning Models for Predicting Daily ET_{0}

Six machine learning models, namely, TCN, ANN, LSTM, KNN, RF, and LGB, were developed in this study to predict daily ET_{0}.

#### 2.2. Data Management and the Development of Machine Learning Models

The proposed models were developed to estimate ET_{0} with incomplete meteorological data under three data availability situations: temperature-based models, which used only measured data on maximum and minimum air temperature; humidity-based models, which used measured data on maximum and minimum air temperature, average temperature, and relative humidity; and radiation-based models, which used measured data on maximum and minimum air temperature, average temperature, and solar radiation. In all three cases, extraterrestrial radiation, estimated from latitude and the day of the year, was applied to supplement the observed data [39]. Furthermore, the differences in performance between the proposed machine learning models and the relevant empirical equations were individually tested for significance.
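As an illustration of these three input combinations, the sketch below assembles each feature set from a daily record and runs a minimal K-nearest-neighbors regression (one of the six models evaluated). The station records, values, and names here are hypothetical; the actual study trains on decades of daily data per station.

```python
import math

# Hypothetical daily records: (Tmax, Tmin, RH, Rs, Ra, ET0) -- illustration only.
RECORDS = [
    (22.1, 9.3, 55.0, 21.4, 38.2, 4.1),
    (25.6, 11.0, 48.0, 23.0, 39.0, 4.9),
    (18.4, 6.2, 62.0, 17.8, 36.5, 3.2),
    (28.0, 13.5, 41.0, 24.6, 39.5, 5.6),
]

def features(rec, mode):
    """Build one of the three data availability situations described above."""
    tmax, tmin, rh, rs, ra, _ = rec
    if mode == "temperature":
        return (tmax, tmin, ra)                       # Ra supplements the observations
    if mode == "humidity":
        return (tmax, tmin, (tmax + tmin) / 2, rh, ra)
    if mode == "radiation":
        return (tmax, tmin, (tmax + tmin) / 2, rs, ra)
    raise ValueError(mode)

def knn_predict(train, query, mode, k=2):
    """Minimal KNN regression: average ET0 of the k nearest training days."""
    nearest = sorted(train, key=lambda r: math.dist(features(r, mode),
                                                    features(query, mode)))
    return sum(r[-1] for r in nearest[:k]) / k
```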

The average daily values of T_{max}, T_{min}, RH, R_{a}, R_{s}, and ET_{0} from 1960 to 2019 were available for this study. All weather stations were grouped using these features via the K-means algorithm, one of the most commonly used clustering techniques, which has the advantages of being quick and simple. Given K initial centroids, the K-means algorithm assigns the data points to K clusters by minimizing the distance from each vector to the centroid of its cluster. It therefore produces different clustering outcomes for different cluster numbers and initial centroid values. Since choosing the K value is not an easy task, the best K was determined using the silhouette coefficient, whose value lies between −1 and 1: the closer to 1, the better the cohesion and separation. Figure 2 demonstrates that three clusters was the best option for this study, with a silhouette coefficient of 0.57. Table 1 displays the K-means output: Eergunaqi, Tulihe, Xiaoergou, and Aershan belonged to Group III; Manzhouli, Hailaer, Xinbaerhuyouqi, Xinbaerhuzuoqi, Zhalantun, and Suolun belonged to Group II; and the other stations belonged to Group I.
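For illustration, a minimal pure-Python K-means in the spirit described above; the real grouping clustered the station-averaged weather features and chose K = 3 via the silhouette coefficient, whereas the points below are synthetic.

```python
import math
import random

def kmeans(points, k, iters=100, seed=0):
    """Plain K-means: assign each point to its nearest centroid, then
    recompute centroids as cluster means, until assignments settle."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)        # pick k distinct points as starting centroids
    labels = [0] * len(points)
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids
```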

where x_{n} and x_{i} represent the normalized and raw training and testing data, respectively, and x_{max} and x_{min} are the maximum and minimum of the training and testing data, respectively.
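The min-max normalization can be sketched as follows; the bounds are fitted once on the training data and reused for the test data, and the names are illustrative.

```python
def minmax_scale(values, lo=None, hi=None):
    """x_n = (x_i - x_min) / (x_max - x_min).

    When lo/hi are omitted they are fitted from `values` (training);
    pass the fitted bounds back in to scale test data consistently.
    """
    lo = min(values) if lo is None else lo
    hi = max(values) if hi is None else hi
    return [(v - lo) / (hi - lo) for v in values], lo, hi
```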

#### 2.3. Model Performance and Assessment

Three statistical indicators, namely, the determination coefficient (R^{2}), root mean square error (RMSE), and mean absolute error (MAE), were used to assess and compare the performance of the trained models for estimating ET_{0}. The formulae are as follows:

where O_{i} is the ith observed value (mm/d), P_{i} is the ith predicted value (mm/d), $\overline{O}$ is the average of the observed values O_{i} (mm/d), $\overline{P}$ is the average of the model-predicted values P_{i} (mm/d), and N is the number of samples.
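A small sketch of the three indicators; R^{2} is taken here as the squared correlation between observed and predicted values, matching the use of both $\overline{O}$ and $\overline{P}$ above.

```python
import math

def r2(obs, pred):
    """Squared correlation between observed and predicted values."""
    n = len(obs)
    o_bar, p_bar = sum(obs) / n, sum(pred) / n
    cov = sum((o - o_bar) * (p - p_bar) for o, p in zip(obs, pred))
    var_o = sum((o - o_bar) ** 2 for o in obs)
    var_p = sum((p - p_bar) ** 2 for p in pred)
    return cov ** 2 / (var_o * var_p)

def rmse(obs, pred):
    """Root mean square error (same units as the data, mm/d here)."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mae(obs, pred):
    """Mean absolute error (mm/d here)."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)
```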

## 3. Results

#### 3.1. Temperature-Based Models

The performances of the temperature-based models for estimating ET_{0} in the first strategy are provided in Figure 3. According to Figure 3, the mean R^{2} of the H model was higher than that of MH1 and MH2, and its RMSE and MAE were lower, in the first and second groups. For the third group, however, the mean R^{2} of MH2 was higher and its RMSE and MAE were lower. Thus, the prediction accuracy of TCN, LSTM, LGB, ANN, RF, and KNN, based on air temperature data, was compared with that of H in the first and second groups, and with that of MH2 in the third group.

In the first group, the R^{2} of ANN and KNN did not differ significantly from that of the H model, but both their RMSE and MAE were lower. Figure 3 also illustrates that TCN, LSTM, LGB, and RF all performed similarly better than the H model, with a significant increase in the mean R^{2} values, ranging from 0.054 to 0.091, and significant reductions in RMSE of 0.115–0.224 mm/d and in MAE of 0.39–0.208 mm/d, respectively. In the second group, the ANN and KNN models performed as well as the H model, while the LSTM, LGB, and RF models performed better, with decreases in MAE of 0.131, 0.126, and 0.106 mm/d, respectively. The mean R^{2} value of the TCN was 0.920, which was 0.05 higher than that of H, and the MAE of the TCN was 0.379 mm/d, which was 0.151 mm/d lower than that of H. For the third group, judging from the general trend of R^{2}, RMSE, and MAE, only the TCN model performed better among all machine learning models, with an R^{2} of 0.935, an RMSE of 0.433 mm/d, and an MAE of 0.277 mm/d. These results indicated that, in all three groups, the temperature-based TCN model outperformed the empirical method in accuracy for predicting ET_{0}.

Figure 4 presents the performance of the temperature-based models in the second strategy in terms of R^{2}, RMSE, and MAE. In the first group, compared with the H model, although the R^{2} and RMSE of the ANN, RF, and KNN models did not differ from it, their MAE decreased by 0.120, 0.106, and 0.085 mm/d, respectively. Figure 4 also illustrates that TCN, LSTM, and LGB all performed similarly better than the H model, with a significant increase in the mean R^{2} values, ranging from 0.053 to 0.080, and significant reductions in RMSE of 0.100–0.197 mm/d and in MAE of 0.129–0.182 mm/d, respectively. In addition, compared with LSTM and LGB, TCN slightly increased R^{2} and decreased RMSE and MAE, indicating that TCN has better prediction accuracy than LSTM and LGB.

For the second group, the KNN model yielded the lowest R^{2}. The LGB, ANN, RF, and H models exhibited intermediate, mutually similar performances. The LSTM model showed estimates of ET_{0} comparable to those of the TCN model, with increases in R^{2} of 0.041 and 0.043, decreases in RMSE of 0.124 and 0.130 mm/d, and reductions in MAE of 0.138 and 0.140 mm/d compared with the H model. This indicates that the LSTM showed the highest performance among the six machine learning models, based on its highest R^{2} and lowest MAE and RMSE. For the third group, all machine learning models showed roughly equivalent estimates of ET_{0}; numerically, however, the TCN and LSTM exhibited increases in R^{2} of 0.031 and 0.034, decreases in RMSE of 0.082 and 0.090 mm/d, and reductions in MAE of 0.031 and 0.030 mm/d, respectively, when compared with the MH2 model. As these results show, in cases where the temperature-based machine learning models performed equally well, the TCN was chosen given its better overall performance in this investigation.

#### 3.2. Radiation-Based Models

Among the radiation-based empirical models, the R model performed better in the first two groups (with an R^{2} of 0.839, RMSE of 0.801 mm/d, and MAE of 0.561 mm/d for the first group, and an R^{2} of 0.880, RMSE of 0.702 mm/d, and MAE of 0.461 mm/d for the second group, respectively), while the P model performed slightly better in the third group (with an R^{2} of 0.895, RMSE of 0.542 mm/d, and MAE of 0.352 mm/d). Thus, the prediction accuracies of TCN, LSTM, LGB, ANN, RF, and KNN, based on air temperature and radiation data, were compared with that of R in the first and second groups, and with that of P in the third group.

In the first group, the R^{2} of the KNN and R models did not differ significantly, but the RMSE and MAE of KNN were lower. Figure 5 also illustrates that TCN, LSTM, LGB, ANN, and RF all performed similarly better than the R model, with a significant increase in the mean R^{2} values, ranging from 0.043 to 0.100, and significant reductions in RMSE of 0.166–0.303 mm/d and in MAE of 0.118–0.214 mm/d, respectively. Additionally, TCN significantly outperformed ANN and slightly outperformed LSTM, LGB, and RF in terms of R^{2}, RMSE, and MAE. It is clear, then, that the TCN model showed the highest R^{2} and the lowest RMSE and MAE in the first group, as compared to the other models. Similarly, the R^{2}, RMSE, and MAE of the ANN, KNN, and R models did not differ significantly in the second group. Although the R^{2} and RMSE of the LSTM, LGB, RF, and R models did not differ significantly, the MAE of the LSTM, LGB, and RF models was lower, indicating that their performance was slightly better than that of the R model. Furthermore, the TCN model predicted ET_{0} with a higher R^{2} and a lower MAE than the R model, making it better than the R model at predicting ET_{0} from radiation datasets. As for the third group, TCN, LSTM, LGB, ANN, RF, and KNN performed better than the P model, according to the significant increase in the mean R^{2} values, ranging from 0.045 to 0.074, and significant reductions in RMSE of 0.158–0.244 mm/d and in MAE of 0.127–0.168 mm/d, respectively. According to these results, the radiation-based machine learning models significantly outperformed the radiation-based empirical models. Furthermore, compared with the other machine learning models, TCN slightly increased R^{2} and decreased RMSE and MAE, indicating that the prediction accuracy of TCN was the best among the machine learning models. Overall, in the first strategy, the TCN model showed better results in predicting ET_{0} from radiation datasets.

For the first group in the second strategy, the ET_{0} values predicted using the radiation-based machine learning models were closer to the ET_{0} values computed using the FAO-56 PM equation, demonstrating the satisfactory prediction accuracy of the proposed machine learning models. In particular, the TCN (with an R^{2} of 0.925, RMSE of 0.546 mm/d, and MAE of 0.385 mm/d) performed comparably better than the other machine learning models when calculating ET_{0}. This demonstrated the TCN model’s excellent potential for ET_{0} prediction in the first group. For the second group, there was no discernible difference between the ANN and R models at the 0.05 probability level in the accuracy of calculating ET_{0} from radiation data, but the MAE of the ANN model was numerically greater than that of the R model. Figure 6 also illustrates that the TCN, LSTM, LGB, and RF models all performed similarly better than the R model, with a significant increase in the mean R^{2} values, ranging from 0.049 to 0.068, and significant reductions in RMSE of 0.127–0.239 mm/d and in MAE of 0.096–0.163 mm/d, respectively. Furthermore, the TCN model had the highest R^{2} (0.948), lowest RMSE (0.463 mm/d), and lowest MAE (0.298 mm/d), indicating that it was more accurate than LSTM, LGB, and RF in predicting ET_{0} in the second group. As for the third group, the ET_{0} values predicted by ANN, KNN, LGB, and RF were closer to those of the P model, demonstrating that these machine learning models performed the same as the empirical models in predicting ET_{0}. The TCN and LSTM models were more accurate than the other machine learning and empirical models, according to the R^{2}, RMSE, and MAE performance criteria. Additionally, the LSTM model achieved the highest R^{2} (0.954), lowest RMSE (0.353 mm/d), and lowest MAE (0.239 mm/d). From these results, it can be concluded that LSTM performed slightly better than TCN, LGB, and RF, and much better than the empirical models, under the input combination of T_{max}, T_{min}, and R_{a} in the third group. In general, considering the overall prediction accuracy of the machine learning models under the combinations based on temperature and radiation data, the TCN model showed the more stable results in the second strategy.

#### 3.3. Humidity-Based Models

The R^{2}, RMSE, and MAE of the humidity-based machine learning and empirical models are summarized in Figure 7. According to Figure 7, the ROM model (with an R^{2} of 0.839, RMSE of 0.801 mm/d, and MAE of 0.561 mm/d) performed better than the S model in both strategies. Thus, the prediction accuracy of the TCN, LSTM, LGB, ANN, RF, and KNN models, based on air temperature and humidity features, was compared with that of the ROM model.

In the first strategy, the R^{2} of the ROM model was significantly lower than that of the machine learning models, and its RMSE and MAE were substantially higher, demonstrating that the machine learning models based on temperature and humidity data outperformed the empirical formulas. Furthermore, the R^{2} of TCN was slightly higher, and its RMSE and MAE slightly lower, than those of the other machine learning models, indicating that TCN performed slightly better. In conclusion, all six machine learning models could outperform the empirical equations in terms of accuracy, and the humidity-based TCN model outperformed the others.

In the second strategy, the R^{2} of the ROM model was again significantly lower than that of the machine learning models, and its RMSE and MAE were substantially higher, demonstrating that, when calculating ET_{0} outside of their training region, the well-trained machine learning models based on humidity data performed better than the empirical equations. In addition, the R^{2} of TCN (Groups I and II) and LSTM (Group III) was slightly higher, and their RMSE and MAE slightly lower, than those of the other machine learning models, indicating slightly better performance. In conclusion, all six proposed machine learning models could outperform the empirical equations in terms of accuracy; the humidity-based TCN model outperformed the others in the first and second groups, while the humidity-based LSTM model outperformed the others in the third group.

## 4. Discussion

#### 4.1. Performance of Temperature-Based Models

The Hargreaves model requires only air temperature data as input to estimate ET_{0} and is commonly utilized around the world due to its high accuracy [8]. Since the R^{2}, RMSE, and MAE values of the ANN and KNN models were almost identical to those of the empirical models in all groups, we found that they did not improve the accuracy of calculating ET_{0} in this investigation. The reason might be that the layers of the ANN model were not deep enough, and the KNN model was relatively simple and required no parameter estimation, resulting in a weak ability to capture the nonlinear interactions between the weather and ET_{0}. In contrast, Antonopoulos and Antonopoulos found that the temperature-based ANN has a larger R^{2} and lower MAE, outperforming the H equation [4], and Feng and Cui also reported that the ANN model outperforms the MH method [51]. There are two reasons for this opposite result. First, in general, lower MAE and RMSE values or a higher R^{2} score cannot indicate a prediction closer to the actual value unless the value is significantly smaller or larger, respectively. Second, there were significant regional differences in how well machine learning models predicted outcomes [27]. The prediction accuracy of TCN was noticeably superior to that of H or MH, even though LSTM, LGB, and RF did not perform noticeably better than H or MH in every group. Chen et al. came to the same conclusion, stating that TCN presents the most accurate results based on air temperature data among six machine learning models [27].


#### 4.2. Performance of Radiation-Based Models

Adding radiation features substantially improved the ET_{0} estimation performance. According to several studies, temperature and radiation can account for approximately 80% of the fluctuation in ET_{0} [53]. In this research, three widely used empirical methods based on temperature and radiation data, namely, Makkink, Ritchie, and Priestley–Taylor, were selected for comparison with the machine learning models [51]. The results showed that, under the combinations of temperature and radiation characteristics, all machine learning models greatly outperformed the radiation-based empirical equations. Many studies in recent years have used machine learning models such as ANN, RF, and SVM to predict ET_{0} with restricted meteorological features, finding that their performance is superior to that of empirical equations [5,13]. Deep learning models, however, have been infrequently used to estimate ET_{0}. Numerous studies have demonstrated the superior performance of TCN, LSTM, and ANN in sequence problems. As a result, in this work, we modeled daily ET_{0} using radiation data through these three deep learning models. According to the results, when radiation characteristics were provided, the RMSE and MAE of TCN were lower than those of KNN and ANN in Group I, and slightly lower than those of the other machine learning models in general. This result might be due to the reasonable internal structure of TCN, which is better suited than the other models to capturing the nonlinear interactions between weather and ET_{0}.

In the second strategy, the machine learning models also predicted ET_{0} with a significantly higher accuracy than the empirical equations. This showed that, outside of the training set of weather stations, machine learning models based on radiation data outperform empirical models. It is noteworthy that, in the second strategy, the TCN and LSTM outperformed the other machine learning models.

#### 4.3. Performance of Humidity-Based Models

Adding RH to the temperature-based empirical equation did not improve the estimation of ET_{0}, but further worsened its prediction performance. In contrast to the temperature-based equation, the empirical formula based on temperature and humidity data does not employ extraterrestrial radiation as an input, which might be the cause of this outcome. However, when RH was added as an input variable of the machine learning models, the estimation accuracy of ET_{0} improved significantly compared with the temperature-based machine learning models. It stands to reason that giving a machine learning model more features generally improves its accuracy in predicting ET_{0}.

In the first strategy, the humidity-based machine learning models provided more accurate estimates of ET_{0} than the humidity-based empirical models, similar to the results of the radiation-based machine learning models. This shows that the performance of the proposed machine learning models was also noticeably better than that of the traditional methods given temperature and RH characteristics. It is worth noting that the TCN model outperformed all other proposed humidity-based machine learning models, having the highest R^{2}, the lowest RMSE, and the lowest MAE. The causal and dilated convolutional layers of the TCN model’s internal structure, which have the ability to “remember” previous information, might be the cause of its superior performance.
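The “remember” behavior of a causal, dilated convolution can be illustrated with a toy one-dimensional sketch; the kernel and dilation below are arbitrary, and a real TCN stacks such layers with increasing dilation and residual connections.

```python
def causal_dilated_conv(x, kernel, dilation):
    """y[t] = sum_k kernel[k] * x[t - k*dilation], zero-padded on the left,
    so each output depends only on the current and past inputs."""
    y = []
    for t in range(len(x)):
        s = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:            # indices before the start contribute zero
                s += w * x[idx]
        y.append(s)
    return y
```

With a kernel of length 2 and dilation 2, each output mixes the current value with the value two steps back, and stacking layers grows this receptive field exponentially.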

The second strategy showed that, when estimating ET_{0} using humidity factors beyond the training region, suitable machine learning models can achieve a much greater level of accuracy. The findings demonstrated that, when the machine learning inputs include humidity features, the TCN model was more accurate than the other machine learning models in Groups I and II, and the LSTM model was more accurate in Group III, according to the RMSE and MAE performance criteria. Most importantly, the proposed humidity-based machine learning models performed better overall, suggesting that, in the absence of local meteorological data, a machine learning model could be built using cross-station data with similar meteorological characteristics to estimate the daily ET_{0} of the target station, which has scarcely been reported in previous studies.

## 5. Conclusions

In this study, six machine learning models were proposed for daily ET_{0} estimation under incomplete meteorological data in eastern Inner Mongolia, China. Two strategies were adopted to evaluate the ET_{0} prediction performance of the proposed models: (1) the proposed models were trained and tested separately at every single weather station, and (2) the eastern Inner Mongolia meteorological stations were divided into three groups using the K-means method according to their average climate characteristics, and each station in each group took turns serving as a validation station, testing the models trained on the other stations within the group. The results demonstrated the following: (1) In the three groups, the temperature-based TCN model outperformed the empirical method in the accuracy of predicting ET_{0} in the first strategy, and in the second strategy, the temperature-based TCN, LSTM, and LGB models performed significantly or slightly better than the empirical method. (2) In both strategies, all radiation-based machine learning models provided more accurate results than the empirical methods, particularly the TCN model. (3) In both strategies, all humidity-based machine learning models provided more accurate results than the empirical methods, particularly the TCN model. Most importantly, when only temperature characteristics were available, only the TCN model had an overall greater prediction accuracy than the calibrated temperature-based empirical method in both local and external areas. However, when the radiation or humidity characteristics were added to the given temperature characteristics, all the proposed machine learning models could estimate ET_{0} with an accuracy higher than that of the calibrated empirical equations outside the training study area, which makes it possible to develop an ET_{0} estimation model from cross-station data with similar meteorological characteristics and obtain a satisfactory ET_{0} estimate for the target station.

## Author Contributions

## Funding

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## References


**Figure 3.** The performance of the temperature-based models during the first strategy: (**a**) determination coefficient (R^{2}), (**b**) root mean square error (RMSE), and (**c**) mean absolute error (MAE). Note: Values in the same line followed by different lowercase letters are significantly different at the 5% probability level. The same applies below.

**Figure 4.** The performance of the temperature-based models during the second strategy: (**a**) determination coefficient (R^{2}), (**b**) root mean square error (RMSE), and (**c**) mean absolute error (MAE).

**Figure 5.** The performance of the radiation-based models during the first strategy: (**a**) determination coefficient (R^{2}), (**b**) root mean square error (RMSE), and (**c**) mean absolute error (MAE).

**Figure 6.** The performance of the radiation-based models during the second strategy: (**a**) determination coefficient (R^{2}), (**b**) root mean square error (RMSE), and (**c**) mean absolute error (MAE).

**Figure 7.** The performance of the humidity-based models during the first strategy: (**a**) determination coefficient (R^{2}), (**b**) root mean square error (RMSE), and (**c**) mean absolute error (MAE).

**Figure 8.** The performance of the humidity-based models during the second strategy: (**a**) determination coefficient (R^{2}), (**b**) root mean square error (RMSE), and (**c**) mean absolute error (MAE).
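The three indicators reported in Figures 3–8 are standard goodness-of-fit statistics. A minimal sketch of how they are computed is given below; the sample arrays are illustrative values, not data from the study.

```python
from math import sqrt, fsum

def r2(obs, pred):
    """Coefficient of determination between observed and predicted values."""
    mean = fsum(obs) / len(obs)
    ss_res = fsum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = fsum((o - mean) ** 2 for o in obs)
    return 1 - ss_res / ss_tot

def rmse(obs, pred):
    """Root mean square error."""
    return sqrt(fsum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mae(obs, pred):
    """Mean absolute error."""
    return fsum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

# Illustrative sample only: reference ET0 (e.g. FAO-56 PM) vs. a model estimate, mm/d.
et0_obs  = [3.1, 4.2, 5.0, 2.7, 3.8]
et0_pred = [3.0, 4.5, 4.8, 2.9, 3.9]
print(r2(et0_obs, et0_pred), rmse(et0_obs, et0_pred), mae(et0_obs, et0_pred))
```

Higher R^{2} and lower RMSE/MAE indicate a closer fit to the FAO-56 Penman–Monteith benchmark.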

**Table 1.**Geographic and meteorological information of the 18 weather stations and the cluster number of weather stations during the period 1960–2019.

Station | U_{2} (m·s^{−1}) | RH (%) | SH (h) | T_{min} (°C) | T_{max} (°C) | P (mm) | Cluster
---|---|---|---|---|---|---|---
Eergunaqi | 2.06 | 66.41 | 7.26 | −8.66 | 4.61 | 1.12 | 3
Tulihe | 2.08 | 70.79 | 6.93 | −12.45 | 4.36 | 1.42 | 3
Manzhouli | 3.99 | 62.03 | 8.08 | −6.97 | 6.35 | 0.90 | 2
Hailaer | 3.22 | 66.12 | 7.39 | −6.67 | 5.55 | 1.09 | 2
Xiaoergou | 1.56 | 66.25 | 7.31 | −7.31 | 8.39 | 1.57 | 3
Xinbaerhuyouqi | 3.76 | 59.39 | 8.35 | −4.43 | 7.82 | 0.73 | 2
Xinbaerhuzuoqi | 3.27 | 62.16 | 7.97 | −5.12 | 6.80 | 0.89 | 2
Zhalantun | 2.68 | 56.64 | 7.58 | −2.18 | 9.66 | 1.55 | 2
Aershan | 2.49 | 68.64 | 7.15 | −9.30 | 4.84 | 1.50 | 3
Suolun | 2.82 | 56.82 | 7.74 | −3.88 | 10.38 | 1.40 | 2
Zhaluteqi | 2.70 | 48.23 | 7.90 | 1.27 | 13.28 | 1.13 | 1
Balinzuoqi | 2.66 | 50.02 | 8.31 | −1.04 | 12.92 | 1.11 | 1
Linxi | 2.83 | 49.58 | 8.09 | −1.25 | 11.60 | 1.10 | 1
Kailu | 3.83 | 51.80 | 8.48 | 0.85 | 13.44 | 0.98 | 1
Tongliao | 3.56 | 54.34 | 8.18 | 1.24 | 13.29 | 1.12 | 1
Wengniuteqi | 2.95 | 47.69 | 8.20 | 0.41 | 13.03 | 1.05 | 1
Chifeng | 2.42 | 48.17 | 8.01 | 1.54 | 14.47 | 1.10 | 1
Baoguotu | 3.23 | 49.94 | 7.99 | 1.59 | 13.71 | 1.23 | 1

U_{2}, RH, SH, T_{max}, T_{min}, and P are the average daily wind speed at 2 m height, relative humidity, sunshine duration, maximum air temperature, minimum air temperature, and precipitation, respectively.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Zhang, H.; Meng, F.; Xu, J.; Liu, Z.; Meng, J.
Evaluation of Machine Learning Models for Daily Reference Evapotranspiration Modeling Using Limited Meteorological Data in Eastern Inner Mongolia, North China. *Water* **2022**, *14*, 2890.
https://doi.org/10.3390/w14182890
