Evaluation of 32 Simple Equations against the Penman–Monteith Method to Estimate the Reference Evapotranspiration in the Hexi Corridor, Northwest China

Abstract: Evapotranspiration plays an essential role in various fields of hydrology and agriculture. Reference evapotranspiration (ET0) is mostly applied in irrigation planning and monitoring. An accurate estimation of ET0 contributes to decision and policymaking processes governing water resource management, efficiency, and productivity. Direct measurements of ET0, however, are difficult to achieve, so empirical methods are often required. The Penman–Monteith FAO56 (PM-FAO56) method, for example, is still considered to be the best way of estimating ET0 in most regions of the globe. However, it requires a large number of meteorological variables, which often restricts its applicability in regions with poor or missing meteorological observations. Furthermore, the objectivity of some elements of the empirical equations often used can be highly variable from region to region. There is therefore a need for an alternative, objective method that can more accurately estimate ET0 in regions of interest. This study was conducted in the Hexi corridor, Northwest China. We aimed to evaluate the applicability of 32 simple empirical ET0 models designed under different climatic conditions and with different data input requirements. The models evaluated in this study are classified into three types based on temperature, solar radiation, and mass transfer. The performance of the 32 simple equations relative to the PM-FAO56 model is evaluated using root mean square error (RMSE), mean absolute error (MAE), percentage bias (PBIAS), and Nash–Sutcliffe efficiency (NSE). The results show that the World Meteorological Organization (WMO) and the Mahringer (MAHR) models perform well and are ranked as the best alternative methods to estimate daily and monthly ET0 in the Hexi corridor. Before calibration, the WMO and MAHR models had monthly mean RMSE = 0.46 mm and 0.56 mm, PBIAS = 12.1% and −11.0%, and NSE = 0.93 and 0.93, respectively. After calibration, both models showed significant improvements with approximately equal PBIAS of −2.5%, NSE = 0.99, and RMSE of 0.24 mm. Calibration also significantly reduced the PBIAS of the Romanenko (ROM) method by 82.12% and increased the NSE by 16.7%.


Introduction
Reference evapotranspiration (ET0) is an essential element of the hydrological cycle and of the energy and water balance [1,2]. It plays a crucial role in agricultural and hydrological projects [3,4]. Accurate estimation of ET0 is a prerequisite for sound water resource management, efficiency, and productivity, particularly in semi-arid regions [5,6].
The lysimetric method is one of the micrometeorological techniques used to measure in situ ET0 values, and it is often considered the sole method of achieving accurate ET0 estimates. This method, however, has great shortcomings associated with high costs and complex instrumentation [7,8]. A variety of empirical methods have thus been developed for the task according to different climate conditions [9][10][11][12].
The Food and Agriculture Organization (FAO) has recommended the Penman-Monteith (PM-FAO56) equation as the standard model to estimate the ET0 under various climate conditions and different time scales [13,14]. Research has shown the PM-FAO56 method to be suitable for a variety of climates with differing local factors. These include solar radiation, sunshine duration, wind speed, air humidity, air temperature, and the location properties of the observing station [15][16][17]. The model, however, requires a large number of inputs, many of which, such as wind speed, relative humidity, and solar radiation, are difficult to estimate or observe accurately in regions with few observations. Developing alternatives that relax some of these input requirements would therefore be beneficial [5,7,18][19][20].
The FAO also recommends the Hargreaves and Samani (H-S) method for estimating ET0 values in regions where only observations of minimum and maximum air temperature are available [12]. However, many studies have found that the H-S method encounters uncertainties associated with regional climate conditions and small timescales [6,18]. Calibration of its inputs is therefore required before the method can be used reliably [6,18]. In light of these drawbacks, extensive research has been carried out to develop simpler alternative methods [6][7][8][19][20][21].
Simple empirical ET0 models that use fewer climatic variables than those required by PM-FAO56 have been developed and have shown relatively good performance [22]. Because not all models are equally applicable in all regions, local calibration is applied to reparametrize and adjust a model to local climate conditions [18,23,24]. Berti et al. (2014) [6] evaluated the performance of the H-S method in Italy and concluded that the H-S model overestimates ET0 values. Similar conclusions were obtained in eastern North Carolina, USA [25], Southeast Europe [21], and Southwest China [26]. Significant improvements in H-S model performance in different regions have been gained through local calibration [6,18,23,26][27][28][29][30]. Gao et al. (2017) [31] evaluated different ET0 methods in arid, semi-arid, and humid regions. They recommended the Priestley-Taylor, H-S, and Makkink models as the best substitutes for the PM-FAO56 model in arid, semi-arid, and humid regions of China, respectively [31][32][33]. Comparative studies of the performance of different simple ET0 models have been conducted in different regions of China [32][33][34][35][36][37][38][39][40][41]. Song et al. (2019) assessed the performance of twelve simple ET0 models in Northeast China during growing seasons and recommended the Valiantzas [42], Romanenko [43], and Makkink [44] models as substitutes for the PM-FAO56 [40]. An assessment of ten simple ET0 models based on local climates in China found the Berti et al. (2014) [6] method to be the best PM-FAO56 alternative for China [39]. Two other models (Turc [45] and Valiantzas [46]) were also found to give robust results in subtropical humid, monsoonal regions of China, although caveats persisted [31,37], particularly regarding evaluation and calibration.
The current study aimed to evaluate the performance of 32 simple ET0 models in the arid region of the Hexi Corridor, Northwest China. This work contributes to a better understanding of reference evapotranspiration modeling and water resource management in inland river basins.

Geography and Climate of the Hexi Corridor
The elongated Hexi corridor is situated between latitudes 37°17′ N and 42°48′ N and longitudes 92°12′ E and 104°20′ E, with an elevation of 800-5800 m (Figure 1). The corridor, in Gansu province, China, is bounded by the Qilian Mountains to the south and the Mazong, Heli, and Longshou mountains to the north. It extends from the Wushaoling mountain in the east to the Yumenguan in the west, and connects Northwest China to Xinjiang province. The area covered is 2.7 × 10^5 km², approximately 11.5% of the northwest region [47][48][49]. The corridor is primarily known as the source of dust in the Chinese loess [50]. It is also a major source of China's wheat supply [51], as well as millet and corn [51,52]. Irrigation is essential due to the low annual average precipitation in the region, which results from a dominant westerly wind. The annual mean rainfall fluctuates between 50 and 550 mm [49]. Regional agriculture is found in oases distributed across three inland river basins, namely the Shiyang river basin (SYRB), Heihe river basin (HRB), and Shule river basin (SLRB), named after the three inland rivers of the region, whose sources are in the Qilian Mountains. Moreover, the Hexi corridor has a high annual atmospheric water demand, ranging from 1500 to 2500 mm [53].

Data and Source of Materials
Data used in this study were collected from thirteen meteorological stations distributed across the Hexi corridor (Figure 1). Daily observations for the period 1960-2017 were provided by the China Meteorological Administration (CMA) and include minimum, mean, and maximum air temperature (Tmin, Tmean, and Tmax), minimum and mean relative humidity (RHmin and RHmean), wind speed measured at 10 m height (U10m), and sunshine duration (SSD). Mean wind speeds at 2 m height were estimated from the 10 m measurements using an assumed wind profile relationship [12]. Maximum relative humidity (RHmax) was estimated from mean and minimum relative humidity. Solar radiation (Rs) was estimated from sunshine duration (SSD) using equation (37). Table 1 shows the properties of the thirteen meteorological stations used in this study and summarizes the monthly means of the main climatic variables of each station in the Hexi corridor, observed during 1960-2017.

Table 1. Properties of the thirteen meteorological stations with long-term average climatic conditions: minimum, mean, and maximum air temperature (Tmin, Tmean, and Tmax), mean, maximum, and minimum relative humidity (RHmean, RHmax, and RHmin), wind speed measured at 10 m height (U10m), sunshine duration (SSD), and solar radiation (Rs).
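The 10 m to 2 m wind adjustment mentioned above can be sketched as follows. This is a minimal illustration that assumes the "wind profile relationship" is the logarithmic profile given in FAO-56 [12]; the function name `wind_speed_at_2m` is hypothetical and not taken from the study.

```python
import math


def wind_speed_at_2m(u_z: float, z: float = 10.0) -> float:
    """Convert wind speed u_z (m/s) measured at height z (m) to 2 m height.

    Assumes the FAO-56 logarithmic wind profile:
        u2 = u_z * 4.87 / ln(67.8 * z - 5.42)
    """
    arg = 67.8 * z - 5.42
    if arg <= 1.0:
        # The logarithm would be zero or negative: height too close to the surface.
        raise ValueError("measurement height z is too small for the profile")
    return u_z * 4.87 / math.log(arg)


if __name__ == "__main__":
    # Example: a 3.2 m/s reading from a 10 m anemometer.
    print(round(wind_speed_at_2m(3.2, 10.0), 2))
```

For a 10 m anemometer the conversion factor is close to 0.75, so 2 m wind speeds are roughly three quarters of the reported U10m values.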

Penman-Monteith Method
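The FAO recommends the Penman-Monteith (PM-FAO56) method to estimate the daily ET0 from climatic variables [12], as shown in equation (1):

$$ET_0 = \frac{0.408\,\Delta\,(R_n - G) + \gamma\,\dfrac{900}{T + 273}\,u_2\,(e_s - e_a)}{\Delta + \gamma\,(1 + 0.34\,u_2)} \qquad (1)$$

where $ET_0$ is the reference evapotranspiration (mm/day), $R_n$ is the net radiation at the crop surface (MJ m⁻² day⁻¹), $G$ is the ground heat flux (MJ m⁻² day⁻¹), $T$ is the mean daily air temperature at 2 m height (°C), $u_2$ is the wind speed at 2 m height (m/s), $e_s$ and $e_a$ are the saturation and actual vapor pressure (kPa), $\Delta$ is the slope of the vapor pressure curve (kPa/°C), and $\gamma$ is the psychrometric constant (kPa/°C).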
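As an illustration of how the daily PM-FAO56 computation fits together, the sketch below follows the standard FAO-56 formulation [12]. It assumes net radiation is already available and that actual vapor pressure is derived from RHmax and RHmin; the function names are hypothetical, and the authors' own implementation may differ in its details.

```python
import math


def saturation_vapour_pressure(t_c: float) -> float:
    """Saturation vapour pressure (kPa) at air temperature t_c (deg C)."""
    return 0.6108 * math.exp(17.27 * t_c / (t_c + 237.3))


def pm_fao56_daily(t_min: float, t_max: float, rh_min: float, rh_max: float,
                   u2: float, rn: float, elevation: float, g: float = 0.0) -> float:
    """Daily reference evapotranspiration (mm/day), FAO-56 Penman-Monteith.

    t_min, t_max: air temperature (deg C); rh_min, rh_max: relative humidity (%);
    u2: wind speed at 2 m (m/s); rn: net radiation (MJ m-2 day-1);
    elevation: station elevation (m); g: ground heat flux (MJ m-2 day-1).
    """
    t_mean = (t_min + t_max) / 2.0
    e_tmin = saturation_vapour_pressure(t_min)
    e_tmax = saturation_vapour_pressure(t_max)
    es = (e_tmin + e_tmax) / 2.0                      # mean saturation pressure
    # Actual vapour pressure from RHmax/RHmin (assumed humidity input option).
    ea = (e_tmin * rh_max / 100.0 + e_tmax * rh_min / 100.0) / 2.0
    # Slope of the saturation vapour pressure curve (kPa/deg C).
    delta = 4098.0 * saturation_vapour_pressure(t_mean) / (t_mean + 237.3) ** 2
    # Atmospheric pressure (kPa) and psychrometric constant (kPa/deg C).
    pressure = 101.3 * ((293.0 - 0.0065 * elevation) / 293.0) ** 5.26
    gamma = 0.000665 * pressure
    numerator = (0.408 * delta * (rn - g)
                 + gamma * 900.0 / (t_mean + 273.0) * u2 * (es - ea))
    denominator = delta + gamma * (1.0 + 0.34 * u2)
    return numerator / denominator


if __name__ == "__main__":
    # Inputs of the worked daily example in FAO-56 (Brussels, 6 July).
    print(pm_fao56_daily(12.3, 21.5, 63.0, 84.0, 2.078, 13.28, 100.0))
```

Setting `g=0.0` by default mirrors the convention of neglecting the ground heat flux at the daily time step.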

Simple ET0 Equations
To identify suitable alternatives to PM-FAO56 that require fewer inputs while retaining acceptable results for the Hexi corridor, this study selected 32 simple ET0 equations, classified into three categories: (1) temperature-based [6][7][8]19,43,55], (2) solar radiation-based [18,55][56][57][58][59], and (3) mass transfer-based [60][61][62]. The temperature-based methods are the most widely used to estimate the ET0 due to their relative simplicity and fewer input requirements [63]. The radiation-based methods are also widely used to estimate the ET0 at global and regional scales. This study has also evaluated the performance of mass transfer methods compared to the PM-FAO56 method. The mass transfer (aerodynamic) methods are among the earliest empirical models for estimating evapotranspiration and originate from the method proposed by Dalton (1802) [62]. Previous studies showed that the mass transfer methods are built on the concept of eddy motion transfer of water vapor from the evaporating surface to the atmosphere [63,64]. Table 2 lists the ET0 models used in this study and their respective data requirements.
The ground heat flux ($G$) is ignored at a daily time scale ($G \approx 0$), whereas at a monthly time scale $G$ is derived from the monthly mean temperatures [54], indexed by month order. The net surface radiation ($R_n$) is obtained from the difference between the net shortwave radiation ($R_{ns}$) and the net longwave radiation ($R_{nl}$), as expressed in equation (35):

$$R_n = R_{ns} - R_{nl} \qquad (35)$$

The net shortwave radiation ($R_{ns}$) is deduced from the surface albedo ($\alpha = 0.23$) and solar radiation ($R_s$), as shown in equation (36):

$$R_{ns} = (1 - \alpha)\,R_s \qquad (36)$$

As a direct measurement is missing, the solar radiation is derived from sunshine duration using the Hargreaves method shown in equation (37) [12,22]. This method has been widely used in numerous studies conducted in Northwest China and has shown good agreement with available observations [39].
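The radiation chain described above can be sketched in a few lines. This is only an illustration: it assumes the sunshine-based relation given in FAO-56 [12], Rs = (a_s + b_s n/N) Ra, with the default coefficients a_s = 0.25 and b_s = 0.50, which may differ from the coefficients behind the authors' equation (37). All function names are hypothetical.

```python
import math

GSC = 0.0820    # solar constant, MJ m-2 min-1
ALBEDO = 0.23   # albedo of the reference surface


def extraterrestrial_radiation(lat_deg: float, day_of_year: int):
    """Return (Ra in MJ m-2 day-1, maximum daylight hours N)."""
    phi = math.radians(lat_deg)
    dr = 1.0 + 0.033 * math.cos(2.0 * math.pi * day_of_year / 365.0)
    decl = 0.409 * math.sin(2.0 * math.pi * day_of_year / 365.0 - 1.39)
    # Clamp to [-1, 1] so polar day/night does not break acos.
    cos_ws = max(-1.0, min(1.0, -math.tan(phi) * math.tan(decl)))
    ws = math.acos(cos_ws)                      # sunset hour angle (rad)
    ra = (24.0 * 60.0 / math.pi) * GSC * dr * (
        ws * math.sin(phi) * math.sin(decl)
        + math.cos(phi) * math.cos(decl) * math.sin(ws))
    return ra, 24.0 / math.pi * ws


def solar_radiation_from_sunshine(n: float, n_max: float, ra: float,
                                  a_s: float = 0.25, b_s: float = 0.50) -> float:
    """Solar radiation Rs from sunshine hours n (assumed default coefficients)."""
    return (a_s + b_s * n / n_max) * ra


def net_shortwave_radiation(rs: float, albedo: float = ALBEDO) -> float:
    """Equation (36): Rns = (1 - albedo) * Rs."""
    return (1.0 - albedo) * rs


def net_radiation(rns: float, rnl: float) -> float:
    """Equation (35): Rn = Rns - Rnl."""
    return rns - rnl


if __name__ == "__main__":
    ra, n_max = extraterrestrial_radiation(39.0, 180)   # illustrative latitude/day
    rs = solar_radiation_from_sunshine(9.0, n_max, ra)  # illustrative 9 h of sunshine
    print(ra, n_max, rs, net_shortwave_radiation(rs))
```

The net longwave term is left as an input here because its formulation is not given in the text.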

Model Evaluation, Selection, and Calibration
The performance of simple ET0 equations to estimate the daily and monthly ET0 values in the Hexi region was assessed through model evaluation techniques based on evaluating errors and regression metrics.
Error indices are among the metrics commonly used to evaluate models. In this study we selected mean absolute error (MAE) and root mean square error (RMSE) to compare errors between ET0 estimated from the PM-FAO56 method and that computed from simple ET0 equations. For both indices, values close to 0 were taken to indicate perfect model performance, similar to the approach taken by Singh et al. (2004) [72,73]. The linear regression coefficient (slope) was also used to indicate how well the ET0 values computed from simple models match the ET0 values estimated from the PM-FAO56 method. The regression metrics were also extended to the coefficient of determination (R²) to indicate the degree of agreement.
We also used the Nash-Sutcliffe efficiency (NSE) coefficient to evaluate the degree of fit between the PM-FAO56 method and the simple ET0 models [73]. Percentage bias (PBIAS) was adopted to express the percentage of errors associated with model performance. PBIAS = 0% is the optimal value for the best model. A negative or positive sign indicates that the model overestimates or underestimates the ET0 values, respectively.
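The slope is the linear regression coefficient between the two ET0 series. The MAE, RMSE, NSE, and PBIAS metrics are formulated as follows:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|ET_{0,i}^{eq} - ET_{0,i}^{pm}\right|$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(ET_{0,i}^{eq} - ET_{0,i}^{pm}\right)^{2}}$$

$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{n}\left(ET_{0,i}^{pm} - ET_{0,i}^{eq}\right)^{2}}{\sum_{i=1}^{n}\left(ET_{0,i}^{pm} - ET_{mean}^{pm}\right)^{2}}$$

$$\mathrm{PBIAS} = 100 \times \frac{\sum_{i=1}^{n}\left(ET_{0,i}^{pm} - ET_{0,i}^{eq}\right)}{\sum_{i=1}^{n} ET_{0,i}^{pm}}$$

where $ET_{0,i}^{pm}$ and $ET_{0,i}^{eq}$ are the i-th ET0 values from the PM-FAO56 method and from the simple equation, respectively, $n$ is the number of values, and $ET_{mean}^{pm}$ is the mean calculated as follows:

$$ET_{mean}^{pm} = \frac{1}{n}\sum_{i=1}^{n} ET_{0,i}^{pm}$$

Based on the evaluation metrics, the selected models were further calibrated. As in previous studies, this study used a regression-based method with an omitted intercept [74][75][76] to calibrate each selected model. The calibration process follows the expression:

$$ET_{0}^{cal} = \Psi \cdot ET_{0}^{eq}$$

The coefficient $\Psi$ stands for the linear regression coefficient estimated from the ratio $ET_{0}^{pm} / ET_{0}^{eq}$ according to Xu and Singh (1998) [65]. The time series of 1960-2017 was divided into two parts: 80% of the time series was used to compute the $\Psi$ coefficient, and 20% of the time series (2000-2017) was used to validate the ET0 models. The calibration process relies on tuning the constant values of the models in order to enhance their performance [76,77]. For each model, a constant value is changed to maximize the NSE and minimize the MAE, RMSE, and PBIAS. The results from the calibration procedure were assessed by the evaluation metrics (MAE, RMSE, PBIAS, and NSE) used in this study. The values of the $\Psi$ coefficient were calculated from the calibration data, and the obtained $\Psi$ values were then adopted for the testing and validation time series [75].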
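The evaluation and calibration steps can be sketched as below. The metric definitions follow the standard forms, with PBIAS signed so that positive values indicate underestimation, as stated in the text. The calibration coefficient is computed here as a least-squares slope through the origin; that is one possible reading of "regression with an omitted intercept" and is an assumption, as are all function names.

```python
import math
from typing import Sequence


def mae(pm: Sequence[float], eq: Sequence[float]) -> float:
    """Mean absolute error between PM-FAO56 (pm) and a simple model (eq)."""
    return sum(abs(e - p) for p, e in zip(pm, eq)) / len(pm)


def rmse(pm: Sequence[float], eq: Sequence[float]) -> float:
    """Root mean square error."""
    return math.sqrt(sum((e - p) ** 2 for p, e in zip(pm, eq)) / len(pm))


def nse(pm: Sequence[float], eq: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency; 1 is a perfect fit."""
    mean_pm = sum(pm) / len(pm)
    residual = sum((p - e) ** 2 for p, e in zip(pm, eq))
    variance = sum((p - mean_pm) ** 2 for p in pm)
    return 1.0 - residual / variance


def pbias(pm: Sequence[float], eq: Sequence[float]) -> float:
    """Percentage bias; positive means the simple model underestimates."""
    return 100.0 * sum(p - e for p, e in zip(pm, eq)) / sum(pm)


def calibration_coefficient(pm: Sequence[float], eq: Sequence[float]) -> float:
    """Slope of a zero-intercept least-squares fit of pm on eq (assumed form)."""
    return sum(p * e for p, e in zip(pm, eq)) / sum(e * e for e in eq)


def calibrate(eq: Sequence[float], psi: float) -> list:
    """Apply ET0_cal = psi * ET0_eq."""
    return [psi * e for e in eq]


if __name__ == "__main__":
    # Illustrative numbers only, not data from the study.
    pm_cal, eq_cal = [2.0, 4.0, 6.0], [1.8, 3.5, 5.1]   # calibration period
    pm_val, eq_val = [3.0, 5.0], [2.6, 4.4]             # validation period
    psi = calibration_coefficient(pm_cal, eq_cal)
    print(psi, nse(pm_val, calibrate(eq_val, psi)), pbias(pm_val, calibrate(eq_val, psi)))
```

Fitting the coefficient on one period and scoring it on another, as in the usage example, mirrors the calibration/validation split described above.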

Performance of the Simple ET0 Models
Statistical metrics for the Hexi corridor obtained from comparisons between ET0 calculated from PM-FAO56 and the 32 alternatives are shown in Figures 2 and 3 for daily and monthly timescales, respectively. The MAHR and WMO models show very good performance at all stations in the region. Both models estimated the daily ET0 values with relatively low MAE, RMSE, and PBIAS, and with high NSE coefficients (Figure 2). On a daily basis, the WMO showed a relatively low range of MAE and RMSE, bounded between 0.31 and 0.43.

The comparison between the PM-FAO56 and radiation-based methods showed that the selected models generally underestimated the daily ET0, except for the ABT1 model. The underestimation of ET0 by radiation-based models is shown by the large, positive daily and monthly mean values of PBIAS, ranging from 5.0% to 78.6% and 13.5% to 80.4%, respectively. Moreover, the solar radiation-based methods encountered higher mean values of MAE and RMSE (Figures 2 and 3). The OUD, P-T, and MAK models strongly underestimated the daily ET0, with mean PBIAS of 38.4%, 59.6%, and 77.1%, respectively (Figure 2). Those models also underestimated the monthly ET0 values by 64.6%, 80.4%, and 81.3%, respectively. The P-T and MAK methods used in this study were originally developed for a humid climate. They are suited to timescales of 10 days or longer, which may be the reason for their poor performance in the arid region [78]. Moreover, Tabari et al. (2011) showed that P-T underestimates the ET0 in cold and arid regions [18]. By contrast, the ABT1 model overestimated the daily ET0 values with a mean PBIAS of −14.9% and the monthly ET0 values with a mean PBIAS of −6.5%.

The temperature-based methods showed relatively lower mean MAE and RMSE values than those estimated from the solar radiation and mass transfer-based methods.
The daily and monthly mean MAE values estimated between the PM-FAO56 and temperature models are in the range of 0.48-6.6 mm/day and 0.41-4.5 mm, respectively. The daily and monthly mean RMSE values also vary in the range of 0.65-6.9 mm/day and 0.5-4.6 mm, respectively. However, the BRO method underestimated the ET0 values, with the highest mean MAE and RMSE, averaging 2.7 and 3.2 mm/day on a daily basis and 2.6 and 2.8 mm on a monthly basis, which led this model to perform with the lowest R² and NSE values (Figures 2 and 3). The poor performance is due to the extreme values observed during the freezing period (December-April) [22]. It has been shown that the temperature-based methods are more sensitive to weather conditions, and the BRO method was shown to perform well under a temperature range of 11-22.5 °C [20]. Other temperature-based methods underestimated the daily and monthly ET0 values, including BERT, DORJ, and TRAJ, with the mean PBIAS ranging between 19.8% and 47.9% (Figure 2). The method proposed by Berti et al. (2014) underestimated the ET0 values with a mean PBIAS of 19.8% and 22.9% on a daily and monthly basis, respectively. Moreover, the strongest underestimation of the daily and monthly ET0 values was found at the Mazongshan station, with a PBIAS of 44.7% and 46.8%, respectively. The DAL1 and DAL2 methods showed good performance, with a relatively low daily mean PBIAS of −0.34% and 1.9% and a monthly mean PBIAS of 6.1% and 7.9%, respectively. Moreover, a large number of temperature-based models presented greater slope and R² values than the solar radiation methods (Figures 2 and 3). The low performance in these three models is associated with underestimating the daily ET0 at a large number of stations in the Hexi corridor. The solar radiation-based methods showed a relatively low performance compared to temperature-based and mass transfer-based methods.
The HARG, IRM1, IRM2, and TAB3 models showed satisfactory performance in the Heihe river basin (Figure 4). Figure 5 shows that most ET0 models are robust on the larger (monthly) timescale. The ROM model [43] showed a robust performance at more than 80% of the stations in the Hexi corridor (Figure 5). Among the temperature-based methods, the TAB2, AHO1, H-S, DAL2, and DAL1 models performed markedly better than at the daily timescale. The DAL1 gave a good estimate of the monthly ET0, with an RMSE averaging 0.89 mm and a relatively high performance (NSE = 0.81), while TAB2 and AHO1 performed well in the Shule river basin. In general, significant improvements were observed in a large number of temperature-based methods, concentrated in the middle reach of the Heihe river basin and the Shiyang river basin. The BRO method persisted in poor performance, with higher MAE (2.63 mm/day) and RMSE (2.77 mm/day), and underestimated the ET0 values by more than 50% (Figure 3). The solar radiation-based methods showed improved model performance, particularly the HARG, IRM1, IRM2, and TAB3, which were most robust in the middle reach of the Heihe river basin (Figures 3 and 5). Moreover, a summary of statistical metrics averaged over each basin is presented in Supplementary Table S1. It includes the average MAE, RMSE, PBIAS, and NSE values for each model compared to the PM-FAO56 method at both daily and monthly time steps.

Cross-Comparison of the ET0 Models
The cross-comparison of the 32 models aimed to distinguish the models with the best performance at each river basin of the Hexi corridor. The mean NSE > 0.75 was taken to be the threshold condition of model selection. From Table S1, the two models (MAHR and WMO) satisfied the conditions of NSE coefficients >0.75 at the daily timescale in all basins. The MAHR showed better NSE values of 0.91, 0.93, and 0.94 for the Shiyang river basin (SYRB), Shule river basin (SLRB), and Heihe river basin (HRB), respectively. The WMO was also found to estimate the ET0 values with the significant NSE values of 0.97, 0.96, and 0.95 for SYRB, SLRB, and HRB, respectively (Table S1).
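The selection rule (NSE > 0.75 in every basin) amounts to a simple filter, sketched below. The MAHR and WMO values are the daily basin means quoted above; the entry `MODEL_X` is a hypothetical placeholder added only to show a rejected model and does not come from the study.

```python
NSE_THRESHOLD = 0.75

# Daily mean NSE per river basin. MAHR and WMO values are quoted in the text;
# MODEL_X is a made-up entry used purely for illustration.
DAILY_NSE = {
    "MAHR": {"SYRB": 0.91, "SLRB": 0.93, "HRB": 0.94},
    "WMO": {"SYRB": 0.97, "SLRB": 0.96, "HRB": 0.95},
    "MODEL_X": {"SYRB": 0.80, "SLRB": 0.60, "HRB": 0.78},
}


def select_models(nse_by_model: dict, threshold: float = NSE_THRESHOLD) -> list:
    """Return the models whose NSE exceeds the threshold in all basins."""
    return sorted(
        name
        for name, by_basin in nse_by_model.items()
        if all(value > threshold for value in by_basin.values())
    )


if __name__ == "__main__":
    print(select_models(DAILY_NSE))
```

Requiring the threshold in all basins is the strictest reading of the rule; a per-basin selection would replace `all` with a filter on a single basin key.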
A similar condition was applied to the monthly timescale. The results show 13 models that comply with the condition of NSE coefficients >0.75. Figure 6 depicts the 13 models selected based on best performance (NSE > 0.75) in the three inland river basins of the Hexi corridor. The WMO, MAHR, ROM, and DAL1 models ranked as the best methods and showed a very good performance in all three inland river basins. Moreover, DAL1 resulted in low mean PBIAS values of 1.1% and −3.58% for the HRB and SYRB, respectively. The AHO1 model showed the best performance in the SLRB, with a monthly mean NSE and PBIAS of 0.9 and 2.5%, respectively. However, this model tended to overestimate ET0, with PBIAS averaging −18.5% and −25.4% in the HRB (Table S1). The H-S, DAL2, HARG, and BERT methods revealed good performance in the SYRB (PBIAS = −0.68%, NSE = 0.77) and the middle reach of the HRB (PBIAS = 2.87%, NSE = 0.85). The TAB1, TAB2, and AHO2 are suitable for the SLRB, with monthly mean NSE values of 0.81, 0.83, and 0.82, respectively. The IRM2 performed well in the Heihe river basin only, with NSE = 0.78 and PBIAS = 9.3%. The results analyzed above (Figures 2-6) were obtained before the model calibration. However, numerous studies have suggested model calibration to adjust ET0 results to local climate conditions [78][79][80][81].

Calibration of the ET0 Models
In this study, we calibrated the 13 models selected through the cross-comparison process. Model results were calibrated against the ET0 estimates from the PM-FAO56 method on a monthly timescale. Figure 7 compares the monthly performance of the selected models against PM-FAO56 before and after calibration. The regression coefficients used to change the model parameters at each station are shown in Supplementary Table S2. The calibration process significantly improved the 13 models mentioned above. In particular, it reduced the PBIAS of the ROM method by 82.12% and increased the NSE by 16.7%. The NSE values of the AHO1, IRM2, HARG, and DAL1 methods improved by 33.6%, 19.8%, 13.4%, and 9.3%, respectively, after calibration. The calibration also improved the PBIAS of the AHO2, DAL2, BERT, and H-S methods by 82%, 78.2%, 62.9%, and 18.2%, respectively.
Overall, most models improved after calibration, with PBIAS values of less than 15%. The calibration results show that the WMO and MAHR methods remain robust in the Hexi region, with low mean PBIAS values of −2.5% and −2.6%, respectively. The robust performance of the WMO, MAHR, and ROM methods can be explained by the sensitivity of ET0 to the variation of relative humidity in the Hexi corridor [82].
A time series for 2000-2017 was used to validate the calibrated methods. Figure 8 compares the results of the 13 validated methods and PM-FAO56. The results reveal that the WMO, MAHR, and ROM methods remain the best substitute models to estimate the ET0 in the Hexi corridor. They showed robust NSE coefficients of 0.98, 0.98, and 0.95, respectively, and lower MAE and RMSE values than the other validated models. Their RMSE values ranged from 0.15 to 0.43 mm, from 0.20 to 0.50 mm, and from 0.29 to 0.96 mm after validation, respectively. Comparative studies of different ET0 models against the PM-FAO56 have been documented [73,81,82]. A comparison between the PM-FAO56 method and 34 ET0 methods showed that the WMO, Droogers, Allen, and Ahooghalandari models performed very well in the semi-arid region of New Mexico [4]. Tabari et al. (2013) found that the Romanenko (ROM) model performed well in estimating ET0 values in the humid climate of Iran [78]. A low performance of the IRM1 method was also reported in Eastern Africa [83]. Peng et al. (2017) evaluated 10 ET0 equations and recommended the Berti method as the best alternative method to estimate monthly ET0 in mainland China [39]. Gao et al. (2017) recommended the Priestley and Taylor model as the best substitute for the PM-FAO56 in the arid region of Northwest China [32]; however, in the current study, the Priestley and Taylor model did not show promise for direct application in the Hexi corridor.
Previous studies have also shown the best performance of both the Mahringer (MAHR) and WMO models in different regions [77,84,85]. An assessment of 16 ET0 models reported that the MAHR model showed relatively good performance compared to the PM-FAO56 method in the Senegal river basin [84,86]. An overestimation of ET0 values from the MAHR model was observed at some stations of New Mexico, USA [4]. The WMO underestimated ET0 values, which is consistent with the previous results obtained in Malaysia [87]. Shiri (2018) reported poor performance of mass transfer methods in southern Iran, and found that the calibration process improved their performance [74]. The poor performance of the Trabert and Jensen-Haise models in the Hexi corridor is consistent with that reported by Meng Li et al. (2018) in Eastern China [88].
Numerous studies have evaluated the Hargreaves and Samani equation and have suggested local calibration to adjust the model to local climate conditions [26,89,90]. Tabari and Talaee (2011) showed that the original Hargreaves method underestimated ET0 values, and that calibrating its original empirical coefficient from 0.0023 to 0.0031 improved the model performance in the cold and arid regions of Iran [18].
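To make the effect of that coefficient change concrete, the sketch below evaluates the Hargreaves and Samani equation in its FAO-56 form [12], ET0 = C (Tmean + 17.8) (Tmax − Tmin)^0.5 Ra, with Ra expressed in mm/day. The function name and the example inputs are hypothetical; the factor 0.408 converts Ra from MJ m⁻² day⁻¹ to equivalent evaporation.

```python
import math


def hargreaves_samani(t_min: float, t_max: float, ra: float,
                      coefficient: float = 0.0023) -> float:
    """Reference evapotranspiration (mm/day) from temperature only.

    t_min, t_max: daily air temperature (deg C);
    ra: extraterrestrial radiation (MJ m-2 day-1);
    coefficient: empirical coefficient (0.0023 in the original equation).
    """
    if t_max < t_min:
        raise ValueError("t_max must not be lower than t_min")
    t_mean = (t_min + t_max) / 2.0
    ra_mm = 0.408 * ra  # MJ m-2 day-1 -> mm/day of equivalent evaporation
    return coefficient * (t_mean + 17.8) * math.sqrt(t_max - t_min) * ra_mm


if __name__ == "__main__":
    # Illustrative inputs only, not station data from the study.
    original = hargreaves_samani(10.0, 26.0, 30.0)
    recalibrated = hargreaves_samani(10.0, 26.0, 30.0, coefficient=0.0031)
    print(original, recalibrated, recalibrated / original)
```

Because the coefficient enters linearly, moving from 0.0023 to 0.0031 scales every estimate by the same factor of about 1.35, which is why a single regression coefficient can correct a systematic underestimation.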

Conclusions
The performance of 32 simple ET0 alternatives to the PM-FAO56 method, developed based on three approaches (temperature, radiation, and mass transfer), was assessed for the Hexi Corridor in Northwest China. From our assessment, the World Meteorological Organization (WMO) and Mahringer (MAHR) methods are the most robust. The Romanenko (ROM) model is also a good substitute for PM-FAO56, especially in the middle reach of the Heihe river basin and the Shiyang river basin. Among the temperature-based methods, the Ahooghalandari (AHO1) and Tabari (TAB2) models performed well in the Shule river basin, and the Hargreaves method and its derived equations presented the best performance in the middle reach of the Heihe river basin. A large number of the mass transfer-based methods performed poorly, overestimating ET0 values. The poor performance of the solar radiation-based methods is attributable to an underestimation of ET0 values. Many simple ET0 models tend to perform well on a larger timescale. Calibration and validation significantly improved all selected models. The results from the calibration procedure of 13 models on a monthly time scale show that the WMO, MAHR, ROM, AHO1, AHO2, DAL1, HARG, IRM2, TRAB, H-S, BERT, TAB1, and DAL2 methods are the best substitutes for the PM-FAO56 method for estimating the ET0 in the Hexi corridor.
Moreover, models that integrate temperature, relative humidity, and wind speed (WMO and MAHR) were ranked the best, followed by models that integrate the temperature and relative humidity (ROM, AHO1, and AHO2). The results of this study will be beneficial for selecting the simple ET0 method appropriate for the Hexi corridor and its inland river basins, as well as the local weather stations. When temperature, relative humidity, and wind speed data are available, the WMO method can be used to estimate ET0 values in the Hexi corridor. In the case of missing wind speed and solar radiation, the ROM method can be adopted for estimating ET0 values. The adoption of the DAL1 method is recommended when only temperature data are available.

Supplementary Materials:
The following are available online at www.mdpi.com/2073-4441/12/10/2772/s1, Table S1: A summary of mean statistical metrics averaged for each river basin before calibration. Table S2: Regression coefficients (Ψ) used to calibrate and validate the 13 ET0 models on a monthly timescale.

Conflicts of Interest:
The authors declare no conflict of interest.