Modern Techniques to Modeling Reference Evapotranspiration in a Semiarid Area Based on ANN and GEP Models

Evapotranspiration (ET) is a significant component of the hydrologic cycle, notably in irrigated agriculture. Direct approaches for estimating reference evapotranspiration (ET0) are either difficult or need a large number of inputs that are not always available from meteorological stations. Over a 6-year period (2006–2011), this study compares Feed Forward Neural Network (FFNN), Radial Basis Function Neural Network (RBFNN), and Gene Expression Programming (GEP) machine learning approaches for estimating daily ET0 at a meteorological station in the Lower Cheliff Plain, northwest Algeria. ET0 was estimated using the FAO-56 Penman–Monteith (FAO56PM) equation and observed meteorological data. The ET0 estimated with FAO56PM was then used as the target output for the machine learning models, while the observed meteorological data were used as the model inputs. Based on the coefficient of determination (R2), root mean square error (RMSE), and Nash–Sutcliffe efficiency (EF), the RBFNN and GEP models showed promising performance. However, the FFNN model performed best during the training (R2 = 0.9903, RMSE = 0.2332, and EF = 0.9902) and testing (R2 = 0.9921, RMSE = 0.2342, and EF = 0.9902) phases in estimating the Penman–Monteith evapotranspiration.


Introduction
Food systems are under pressure to boost yields because global food demand is rising while water resources remain constrained. As a result, there is a need to shift to more sustainable farming techniques and optimized operations that allow for more efficient use of water resources [1]. Appropriate irrigation management, which depends on accurate predictions of crop water requirements, is a critical component of efficient agricultural techniques [2]. Evapotranspiration (ET) is a measure of crop water requirements that includes the transfer of water vapor from the land to the atmosphere by evaporation from the soil and transpiration from the plants [3]. ET is one of the most important components of the hydrological cycle and global climate system [4,5]. Accurate estimation of ET is necessary for water resource management, irrigation planning, watershed management,

Description of Data
The meteorological data include daily observations of maximum, minimum, and mean air temperatures (Tmax, Tmin, and Tmean), daily mean relative humidity (RH), wind speed (WS), sunshine duration (SD), and global radiation (GR). The days with data that proved to be inadequate were excluded from the patterns. The statistical parameters

Evapotranspiration Estimation Method
The FAO-56 Penman–Monteith method for calculating ET0 was implemented following the formulation in [3] as a function of daily mean net radiation, temperature, water vapor pressure, and wind speed. The procedure used was that outlined in Chapter 3 of FAO-56 [3]:
$$ET_0 = \frac{0.408\,\Delta\,(R_n - G) + \gamma\,\dfrac{900}{T_a + 273}\,U_2\,(e_s - e_a)}{\Delta + \gamma\,(1 + 0.34\,U_2)} \qquad (1)$$
where ET0 is the reference crop evapotranspiration (mm day−1), Rn is the net radiation (MJ m−2 day−1), G is the soil heat flux (MJ m−2 day−1), γ is the psychrometric constant (kPa °C−1), es is the saturation vapor pressure (kPa), ea is the actual vapor pressure (kPa), Δ is the slope of the saturation vapor pressure–temperature curve (kPa °C−1), Ta is the average daily air temperature (°C), and U2 is the mean daily wind speed at 2 m (m s−1).
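As a rough illustration of how the FAO-56 Penman–Monteith equation is evaluated for one day, the following Python sketch computes ET0 from the quantities defined above. The function names, the default atmospheric pressure, and the input values are assumptions made for this example; the expressions used for the slope of the vapor pressure curve and the psychrometric constant are the standard FAO-56 ones, and the numbers are illustrative rather than data from the study station.

```python
import math


def saturation_vapor_pressure(t_celsius):
    """Saturation vapor pressure (kPa) at air temperature t_celsius (°C)."""
    return 0.6108 * math.exp(17.27 * t_celsius / (t_celsius + 237.3))


def fao56_pm_et0(rn, g, t_mean, u2, es, ea, pressure_kpa=101.3):
    """Daily reference evapotranspiration ET0 (mm/day), FAO-56 Penman-Monteith.

    rn, g: net radiation and soil heat flux (MJ m-2 day-1)
    t_mean: mean daily air temperature (°C)
    u2: mean daily wind speed at 2 m (m/s)
    es, ea: saturation and actual vapor pressure (kPa)
    pressure_kpa: atmospheric pressure (assumed default: sea level)
    """
    # Slope of the saturation vapor pressure curve (kPa/°C)
    delta = 4098.0 * saturation_vapor_pressure(t_mean) / (t_mean + 237.3) ** 2
    # Psychrometric constant (kPa/°C)
    gamma = 0.000665 * pressure_kpa
    numerator = (0.408 * delta * (rn - g)
                 + gamma * 900.0 / (t_mean + 273.0) * u2 * (es - ea))
    denominator = delta + gamma * (1.0 + 0.34 * u2)
    return numerator / denominator


# Illustrative inputs for a single summer day (not taken from the study data)
et0 = fao56_pm_et0(rn=13.28, g=0.0, t_mean=16.9, u2=2.078,
                   es=1.997, ea=1.409, pressure_kpa=100.1)
print(f"ET0 = {et0:.2f} mm/day")
```

In a workflow like the one described here, a function of this kind would produce the target series that the machine learning models are trained to reproduce.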

Multilayer Perceptron Artificial Neural Network
ANNs are non-linear mathematical models inspired by the behavior of biological neural networks. An ANN consists of layers of interconnected nodes, or neurons. Each neuron receives a linear combination of the outputs of the previous layer (∑ w_ij x_j) or, for the first layer, of the network inputs, and returns a non-linear transformation of this quantity.
The weights (w_ij) are the parameters applied to each input to define this linear combination; they typically also include an intercept term called the activation threshold [35]. A non-linear activation function is then applied to the linear combination, giving f(∑ w_ij x_j).
This activation function can be, for example, a sigmoid function, which constrains each neuron's output between two asymptotes. Once the activation function is applied, each neuron's output feeds into the inputs of the next layer. The most frequently used ANN architecture consists of an input layer, in which the data are introduced into the network; one or more hidden layers, in which the data are processed; and an output layer, which produces the predicted output value(s) [35].
The literature contains many kinds of neural networks that have been put to many uses. The Multilayer Perceptron (MLP) is a commonly used ANN configuration that is regularly applied in hydrological modeling [36,37] (Figure 2). This study assesses the usefulness of MLP networks for the estimation of ET0. The MLP is the most frequently used and simplest neural network architecture [38].
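The forward pass described above can be sketched in a few lines of Python. This is a minimal illustration with one hidden layer, a hyperbolic tangent activation in the hidden layer, and a linear output, matching the transfer functions reported later for the FFNN models; the function name, weights, and inputs are hypothetical and are not the trained parameters of the study.

```python
import math


def mlp_forward(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass of a single-hidden-layer MLP with one linear output."""
    # Each hidden neuron applies tanh to its weighted sum of the inputs
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(hidden_weights, hidden_biases)
    ]
    # The output neuron is a plain linear combination of the hidden outputs
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias


# Hypothetical network: 2 normalized inputs, 2 hidden neurons
x = [0.5, -0.2]
y = mlp_forward(
    x,
    hidden_weights=[[1.0, 0.0], [0.0, 1.0]],
    hidden_biases=[0.0, 0.0],
    output_weights=[1.0, 1.0],
    output_bias=0.1,
)
print(y)
```

Training then amounts to adjusting the weights and biases so that the output matches the target ET0 values as closely as possible.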

Radial Basis Function
Another commonly used ANN architecture is the radial basis function (RBF) network. Multilayer, feed-forward RBF networks are often used for multi-dimensional spatial interpolation. The term "feed-forward" means that the neurons are arranged in layers and that signals pass in one direction only, from the input layer to the output layer [39]. The underlying architecture of a three-layer network, with one hidden layer between the input and output layers, is presented in Figure 3. The activation function of each hidden neuron has the form of an RBF, which generates a response only if the inputs are close to some central value determined for that particular neuron.
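The localized response of an RBF neuron can be illustrated with a short Python sketch. It assumes a Gaussian basis function of the form exp(−d²/(2σ²)), where d is the distance to the neuron's center and σ is the spread; the exact scaling used by the software in this study may differ, and the centers, weights, and function names are hypothetical.

```python
import math


def gaussian_rbf(x, center, spread):
    """Response of one RBF neuron: near 1 close to its center, near 0 far away."""
    dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-dist_sq / (2.0 * spread ** 2))


def rbf_network(x, centers, spread, weights, bias=0.0):
    """Output of an RBF network: weighted sum of the hidden-neuron responses."""
    return bias + sum(
        w * gaussian_rbf(x, c, spread) for w, c in zip(weights, centers)
    )


# Hypothetical network with two centers in a 2-D normalized input space
centers = [[0.2, 0.2], [0.8, 0.8]]
weights = [1.0, 3.0]
near_first = rbf_network([0.2, 0.2], centers, spread=0.1, weights=weights)
near_second = rbf_network([0.8, 0.8], centers, spread=0.1, weights=weights)
print(near_first, near_second)
```

A small spread makes each neuron respond only in a narrow neighborhood of its center, while a large spread makes the responses overlap, which is why the spread has to be tuned.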

Gene Expression Programming
While ANNs are complex models that typically do not express the physical relationships between process components in an interpretable form, GEP models can express the relationship between dependent and independent variables explicitly [40]. The procedure for modeling daily evapotranspiration (the dependent variable) from weather variables (the independent variables) involves the following steps: selecting the fitness function; selecting the set of terminals T and the set of functions F used to create the chromosomes; selecting the chromosome architecture; and selecting the link function and genetic operators (Figure 4) [35].
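To make the idea of an explicit expression concrete, the sketch below decodes and evaluates GEP genes written in the breadth-first (Karva) notation commonly used in GEP, and links two genes by addition. The function set, the genes, the terminal values, and the helper names are hypothetical choices for illustration; they are not the settings or the evolved expressions of this study, and no evolutionary search is performed.

```python
import operator

# Hypothetical function set F (all binary); terminals are weather variables
FUNCTIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate_gene(gene, terminals):
    """Evaluate one gene given as a list of symbols in breadth-first order."""
    # Assign to each expressed symbol the positions of its arguments
    children = []
    next_free = 1
    for position, symbol in enumerate(gene):
        if position >= next_free:
            break  # symbols beyond this point are not part of the expression
        arity = 2 if symbol in FUNCTIONS else 0
        children.append(list(range(next_free, next_free + arity)))
        next_free += arity

    def value(position):
        symbol = gene[position]
        if symbol in FUNCTIONS:
            left, right = (value(child) for child in children[position])
            return FUNCTIONS[symbol](left, right)
        return terminals[symbol]

    return value(0)


def evaluate_chromosome(genes, terminals):
    """Combine the gene values with an addition link function."""
    return sum(evaluate_gene(gene, terminals) for gene in genes)


weather = {"Tmean": 20.0, "RH": 0.5, "WS": 2.0, "GR": 10.0}
gene_a = ["+", "*", "Tmean", "RH", "WS", "GR", "GR"]  # (RH * WS) + Tmean
gene_b = ["-", "GR", "WS", "RH", "RH"]                # GR - WS
print(evaluate_chromosome([gene_a, gene_b], weather))
```

Because every chromosome decodes to a formula of this kind, the final GEP model can be written out and inspected, unlike the weight matrices of a neural network.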

Evaluation Criteria
The performance of the models used in this study was evaluated using standard statistical performance criteria. The statistical measures taken into account were the coefficient of determination (R2), the root mean square error (RMSE), and the Nash–Sutcliffe efficiency coefficient (EF) [41–43]. The three criteria were calculated according to Equations (2)–(4),
where N is the number of observed ET data, ETi(observed) and ETi(model) are the observed and model-estimated values of ET, respectively, and ETmean is the mean of the observed ET.
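The three criteria can be computed as in the following Python sketch. It assumes the commonly used definitions: R2 as the squared Pearson correlation between observed and modeled values, RMSE as the square root of the mean squared difference, and EF as one minus the ratio of the residual sum of squares to the sum of squared deviations of the observations from their mean. The function names and sample values are illustrative only.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def nash_sutcliffe(observed, predicted):
    """Nash-Sutcliffe efficiency (EF): 1 is a perfect fit."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    total = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / total


def r_squared(observed, predicted):
    """Coefficient of determination as the squared Pearson correlation."""
    mean_obs = sum(observed) / len(observed)
    mean_pred = sum(predicted) / len(predicted)
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    return cov ** 2 / (var_obs * var_pred)


# Illustrative ET0 values in mm/day (not from the study)
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.1, 1.9, 3.2, 3.8]
print(r_squared(obs, pred), rmse(obs, pred), nash_sutcliffe(obs, pred))
```

Under these definitions R2 measures only linear association, whereas EF also penalizes bias, so the two can differ slightly for the same model.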

Results and Discussion
In this study, ET0 values were first computed by the Penman–Monteith method using climatic data. The inputs (meteorological data) and the output (ET0 calculated by Penman–Monteith) were then normalized using the following equation:
where Xn and Xo stand for the normalized and original data, while Xmin and Xmax represent the minimum and maximum values in the original data. Approximately 70% of the available data period (from around 2006 to 2010) was selected for the training phase; the remaining 30%, belonging to the year 2011, was used for the testing phase. MATLAB was used for the modeling process.
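The normalization and the chronological split can be sketched as follows. The sketch assumes the common min-max form Xn = (Xo − Xmin)/(Xmax − Xmin), which maps the data to the range [0, 1]; the study may have used a different target range. The record layout with a "year" field and the helper names are hypothetical.

```python
def min_max_normalize(values):
    """Scale values to [0, 1] using the minimum and maximum of the series."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def split_by_year(records, test_year=2011):
    """Chronological split: years before test_year train, test_year tests."""
    train = [r for r in records if r["year"] < test_year]
    test = [r for r in records if r["year"] == test_year]
    return train, test


# Illustrative daily records (values are not from the study)
records = [
    {"year": 2006, "tmean": 12.0},
    {"year": 2008, "tmean": 30.0},
    {"year": 2010, "tmean": 21.0},
    {"year": 2011, "tmean": 18.0},
]
scaled = min_max_normalize([r["tmean"] for r in records])
train, test = split_by_year(records)
print(scaled, len(train), len(test))
```

Splitting by year rather than at random keeps the test period strictly later than the training period, so the test results reflect performance on unseen conditions.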

Application of MLP
In this study, the FFNN algorithm was used with a single hidden layer. The parameters used for the FFNN model are listed in Table 3. Because the input data play a considerable role in model development, several input combinations were tested. The performance of all MLP-based input combinations is listed in Table 4 for the training and testing stages. MLP-based model development is a trial and error process. In this study, the tangent sigmoid transfer function was used in the hidden layer, and the linear transfer function was used in the output layer. To achieve the best performance with MLP models, the number of neurons in the hidden layer has to be optimized. The results in Table 4 suggest that the FFNN2 model, which includes Tmax, Tmean, (Tmax − Tmin), RH, I, WS, and GR, performed better than the other FFNN-based input combinations, with R2 values of 0.9903 and 0.9921, RMSE values of 0.2332 and 0.2342, and EF values of 0.9902 and 0.9902 for the training and testing stages, respectively. Nineteen neurons were used in the hidden layer to achieve this performance. The agreement plot of actual and predicted values of the FFNN2 model for the training and testing stages is shown in Figure 5; most values lie very close to the 45° line and follow the same pattern as the actual values in both stages. If all the values lay on the 45° line, the model would be ideal and its predictions would equal the actual values.

The performance evaluation results suggest that the FFNN2 model performed better than the other input combination-based models. However, the results in Table 4 indicate that several other models are comparable in performance to the best model (FFNN2) while requiring fewer input meteorological variables. Overall, based on the assessment in Table 4, the FFNN11 model (Tmean, RH, WS, and GR) is suitable for predicting ET0, with R2 values of 0.9875 and 0.9892, RMSE values of 0.2656 and 0.2623, and EF values of 0.9873 and 0.9877 for the training and testing stages, respectively. As for the FFNN2 model, 19 neurons were used in the single hidden layer to achieve this performance.
The agreement plot of actual and predicted values of the FFNN11 model for the training and testing stages is shown in Figure 5; most values lie very close to the line of perfect agreement and follow the same pattern as the actual values in both stages.

Application of RBF
For the RBF method as well, several input combinations were used for model development. The performance of all input combination-based RBFNN models is listed in Table 5 for the training and testing stages. RBFNN model development is a trial and error process similar to FFNN model development. In this study, the RBF models had a single hidden layer. To achieve the best performance with RBFNN models, the value of the spread must be found by trial and error. The results in Table 5 suggest that the RBFNN5 model, which includes Tmin, Tmax, Tmean, RH, I, WS, and GR, performs better than the other RBFNN-based input combinations, with R2 values of 0.9907 and 0.9911, RMSE values of 0.2270 and 0.2374, and EF values of 0.9907 and 0.9899 for the training and testing stages, respectively. The agreement plot of actual and predicted values of the RBFNN5 model for the training and testing stages is shown in Figure 6; most values lie very close to the 45° line and follow the same pattern as the actual values in both stages.

The performance evaluation results suggest that the RBFNN5 model performs better than the other input combination-based models. However, the results in Table 5 indicate that several other models performed comparably to the best model (RBFNN5) while using fewer inputs. Overall, the assessment in Table 5 shows that the RBFNN11 model (Tmean, RH, WS, and GR) is suitable for predicting ET0, with R2 values of 0.9886 and 0.9892, RMSE values of 0.2514 and 0.2551, and EF values of 0.9886 and 0.9884 for the training and testing stages, respectively. A lower spread value (Table 5) was used in the development of this model than for the RBFNN5 model. The agreement plot of actual and predicted values of the RBFNN11 model for the training and testing stages is shown in Figure 6; most values lie very close to the line of perfect agreement and follow the same pattern as the actual values in both stages.

Application of GEP
The parameters used in the GEP model are listed in Table 6. The performance of all input combination-based GEP models is listed in Table 7 for the training and testing stages. GEP-based model development is also a trial and error process, similar to the development of the FFNN and RBFNN models. Across the different input combinations, for the training phase, R2 ranged from 0.6973 to 0.9664, RMSE from 0.4830 to 1.3112 mm day−1, and EF from 0.6895 to 0.9579. For the test phase, R2 ranged from 0.8057 to 0.9775, RMSE from 0.3701 to 1.1224 mm day−1, and EF from 0.7744 to 0.9755 (Table 7). It is clear that the presence or absence of critical meteorological variables in the input combinations significantly affected GEP model performance. The results in Table 7 suggest that the GEP11 model, which includes Tmean, RH, WS, and GR in the input combination, performed better than the other input combination-based GEP models, with R2 values of 0.9606 and 0.9775, RMSE values of 0.4830 and 0.3701, and EF values of 0.9579 and 0.9755 for the training and testing stages, respectively. The agreement plot of actual and predicted values of the GEP11 model for the training and testing stages is shown in Figure 7; most values lie very close to the 45° line and follow the same path as the actual values in both stages. Table 7 thus indicates that GEP11 is the best performing GEP model and uses the optimum input combination.
Table 8 shows that the FFNN2-based model works better than the RBFNN- and GEP-based models. Figure 8 indicates that the values predicted by the FFNN2 model lie closer to the line of perfect agreement than the values predicted by the RBFNN- and GEP-based models.
Figure 8. Scatter plot of observed and predicted values for the best input combination-based models using the testing data set.
The overall performance of the FFNN2-based model is reliable and suitable for the prediction of ET0. As such, the FFNN model based on the input combination Tmax, Tmean, (Tmax − Tmin), RH, I, WS, and GR could be used for the prediction of ET0. However, the single-factor ANOVA results in Table 9 suggest that there is no significant difference between the observed values and the values predicted by the best input combination-based FFNN, RBFNN, and GEP models.
Table 9. Single-factor ANOVA results for the best combination of inputs.
Figure 9 displays box plots of the prediction errors of the best input combination-based models for the test period. The descriptive statistics of the prediction errors for the best input combinations are listed in Table 10. According to Table 10 and Figure 9, the FFNN2 model followed the corresponding observed values with a lower minimum error (−0.8840) and a lower maximum error (1.4199), and the width of its first quartile is smaller than that of the other best input combination-based models.
The Taylor diagram of the observed ET0 and the ET0 predicted by the best input combination-based models over the test period is depicted in Figure 10. The representative points of all the applied models have nearly the same position. The FFNN2 model is located nearest to the observed point, with lower values of RMSE and SD and a higher coefficient of correlation, which identifies it as the superior model.
Table 11 indicates that, among the optimum input combination-based models, the RBFNN11-based model works better than the FFNN- and GEP-based models. Figure 11 indicates that the values predicted by the RBFNN11-based model lie closer to the line of perfect agreement than the values predicted by the FFNN- and GEP-based models. The overall performance of the RBFNN11-based model is reliable and suitable for the prediction of ET0, which suggests that the RBFNN model based on the input combination Tmean, RH, WS, and GR could be used for the prediction of ET0. The single-factor ANOVA results in Table 12 suggest that there is no significant difference between the observed values and the values predicted by the optimum input combination-based FFNN, RBFNN, and GEP models.
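The single-factor ANOVA used for this kind of comparison can be sketched in Python as follows. The sketch computes the F statistic from the between-group and within-group sums of squares; each group would be one series, for example the observed ET0 and the values predicted by each model. The function name and sample values are illustrative, and the significance decision, which requires comparing F with a critical value or computing a p-value, is not included.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a single-factor ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the values around their own group mean
    ss_within = sum(sum((v - m) ** 2 for v in g) for g, m in zip(groups, means))
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Illustrative series (not from the study)
observed = [1.0, 2.0, 3.0]
model_a = [2.0, 3.0, 4.0]
model_b = [3.0, 4.0, 5.0]
f_stat, df_b, df_w = one_way_anova_f([observed, model_a, model_b])
print(f_stat, df_b, df_w)
```

An F value below the critical value for the given degrees of freedom is what supports a conclusion of no significant difference between the observed and predicted series.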
Figure 11. Scatter plot of observed and predicted values for the optimum input combination-based models using the testing data set.
Figure 12 displays the box plot of the prediction errors of the optimum input combination-based models for the test period. The descriptive statistics of the prediction errors for the optimum input combinations are listed in Table 13. According to Table 13 and Figure 12, the RBFNN11 model followed the corresponding observed values with a lower maximum error (1.3700), and the width of its first quartile (−0.0952) is smaller than that of the other optimum input combination-based models.

The Taylor diagram of the observed ET0 and the ET0 predicted by the optimum input combination-based models over the test period is depicted in Figure 13. The representative points of all the applied models have nearly the same position. The RBFNN11 model is located nearest to the observed point, with lower values of RMSE and SD and a higher coefficient of correlation, which makes it the superior model with the optimum number of input parameters.
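The quantities summarized by a Taylor diagram can be computed as in the sketch below: the standard deviations of the observed and predicted series, their correlation coefficient, and the centered root mean square difference, which the diagram relates geometrically through the law of cosines. The sketch assumes population (divide by N) statistics and the centered form of the RMS difference; the function name and sample values are illustrative.

```python
import math


def taylor_statistics(observed, predicted):
    """Return (sd_obs, sd_pred, correlation, centered_rms_difference)."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    dev_obs = [o - mean_obs for o in observed]
    dev_pred = [p - mean_pred for p in predicted]
    sd_obs = math.sqrt(sum(d ** 2 for d in dev_obs) / n)
    sd_pred = math.sqrt(sum(d ** 2 for d in dev_pred) / n)
    corr = sum(a * b for a, b in zip(dev_obs, dev_pred)) / (n * sd_obs * sd_pred)
    # RMS difference after removing the mean of each series
    crmsd = math.sqrt(sum((b - a) ** 2 for a, b in zip(dev_obs, dev_pred)) / n)
    return sd_obs, sd_pred, corr, crmsd


# Illustrative ET0 values in mm/day (not from the study)
obs_et0 = [1.0, 2.0, 3.0, 4.0]
pred_et0 = [1.1, 1.9, 3.2, 3.8]
sd_o, sd_p, r, e = taylor_statistics(obs_et0, pred_et0)
# Law of cosines linking the plotted quantities
print(e ** 2, sd_o ** 2 + sd_p ** 2 - 2.0 * sd_o * sd_p * r)
```

A model whose point lies close to the observed point therefore has a standard deviation similar to the observations, a high correlation, and a small centered RMS difference at the same time.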

Conclusions
This study aimed to investigate the potential of FFNN, RBFNN, and GEP to estimate daily evapotranspiration in a semi-arid region in Algeria using different combinations of input meteorological variables. The results showed that both the neural network models (i.e., FFNN and RBFNN) and the GEP models agreed closely with the ET0 obtained by the FAO56PM method and yielded reliable estimates for the semi-arid area in question. The study also found that modeling ET0 with the ANN technique leads to better estimates than the GEP model.
The results suggest that the FFNN-based model 2 outperformed all other applied models. Another major conclusion is that the RBFNN model 11 performed better than the other applied models that use a smaller number of meteorological inputs. The ANN- and GEP-based models suggest that Tmean, RH, WS, and GR are the optimum input parameters for the estimation of daily evapotranspiration in the semi-arid region of Algeria. The overall performance of all applied models is satisfactory, as there is no significant difference between actual and predicted values when the optimum number of input parameters is used in the models.