Examination of Single- and Hybrid-Based Metaheuristic Algorithms in ANN Reference Evapotranspiration Estimating

Abstract: Hydrological resource management, including crop watering and irrigation scheduling, relies on reliable estimates of reference evapotranspiration (ETo). However, previous studies of ETo forecasting have not compared single and hybrid metaheuristic algorithms in much detail. This study aims to assess the efficiency of a novel methodology to simulate univariate monthly ETo estimates using an artificial neural network (ANN) integrated with the hybrid particle swarm optimisation-grey wolf optimiser algorithm (PSOGWO). Several state-of-the-art algorithms, including the constriction coefficient-based particle swarm optimisation and chaotic gravitational search algorithm (CPSOCGSA), the slime mould algorithm (SMA), the marine predators algorithm (MPA) and the modified PSO algorithm (MPSO), were used to evaluate PSOGWO's prediction accuracy. Monthly meteorological data were collected in Al-Kut City (1990 to 2020) and used for model training, testing and validation. The results indicate that pre-processing techniques can improve raw data quality and may also suggest the best predictor scenario. All models can be considered efficient, with acceptable simulation levels. However, the PSOGWO-ANN model slightly outperformed the other techniques based on several statistical tests (e.g., a coefficient of determination of 0.99). The findings can contribute to better management of water resources in Al-Kut City, an agricultural region that produces wheat in Iraq and is under the stress of climate change.


Introduction

Research Background
Water scarcity is a growing issue, especially in arid and semi-arid regions [1], because it endangers the sustainability of food production systems and raises concerns about food security [2]. Agriculture is the major consumer of freshwater resources worldwide [3]. By 2050, the world's food needs will have increased by 60% [4], meaning that informed and careful management of irrigation practices will be required and could result in substantial water savings [5]. Reference evapotranspiration (ETo) is a crucial part of the hydrological cycle; its fluctuations are relevant for managing water resources, irrigation scheduling and agricultural yield [6]. The existing literature suggests that the preferred approach for calculating ETo is the FAO Penman-Monteith equation (FAO-56 PM) [7]. However, like many other empirical models, the FAO-56 PM technique is difficult to apply across diverse climates, particularly in arid and semi-arid countries, because it requires many meteorological variables that might not be available [8].
Iraq's main sources of fresh water are the Tigris and Euphrates rivers. These rivers originate in Turkey, Syria and Iran. Consequently, Iraq's water supplies have been affected by both climate change and human-induced interventions upstream (i.e., unilateral damming, water abstraction and inter-basin water transfer schemes in the upper riparian nations, mainly Turkey and Iran). Iraq is therefore confronted with increasing difficulties in managing its water supplies. Droughts that last for months at a time have become all too prevalent in Iraq [9,10]. In addition, owing to rising temperatures, insufficient and decreasing rainfall, lengthy droughts and water scarcity, frequent sand and dust storms and flooding, Iraq has been ranked as the fifth most vulnerable country to climate change and the thirty-ninth most water-stressed [11].

Applied Machine Learning Methods for ETo Forecasting
Irrigation planning can benefit not only from past and present ETo estimations, but also from future ETo forecasts [12]. Recently, machine learning (ML) methods have been effectively applied to predict ETo in different regions [13,14]. Nourani et al. [1] forecast ETo up to 3 months in advance on the basis of monthly lagged meteorological data. Artificial neural networks (ANN), support vector machines (SVM) and adaptive neuro-fuzzy inference systems (ANFIS) were used singly and in multimodal ensembles to achieve this. The outputs of the separate models were fed into the weighted average (WA), simple average (SA) and neural ensemble (NE) techniques. Monthly meteorological parameters from a number of weather stations in Iraq, North Cyprus and Turkey were employed as inputs during the creation of the models. Owing to their linear behaviour, the SA and WA ensembles did not exhibit a substantial increase in performance, although WA was found to be superior to SA. NE performed best across all stations and modelling processes. The overall outcomes confirmed that the multimodal strategy performed better than the single techniques, boosting performance by up to 24%, 33% and 60% for stations in Iraq, North Cyprus and Turkey, respectively.
Ferreira and da Cunha [12] worked on predicting daily multi-step (7 days) ETo. They compared traditional ML methods (ANN and random forest (RF)), deep learning (DL) models (long short-term memory (LSTM) and one-dimensional convolutional neural networks (1D CNN)) and a hybrid of the two DL techniques (CNN-LSTM) in local and regional scenarios. Fifty-three automated weather stations in Brazil were used, and the results revealed that DL techniques outperformed ML methods by a small margin, with CNN-LSTM2 and CNN-LSTM3 achieving the best results.
Sayyahi et al. [15] simulated daily and monthly ETo with lags of one, two and three days, and one, two and three months, at the Aidoghmoush Basin, Iran. They integrated the multilayer perceptron (MLP) model with three metaheuristic algorithms (MHAs): water wave optimisation (WWO), the genetic algorithm (GA) and particle swarm optimisation (PSO). The MLP-WWO outputs were more in line with the observed data, outperforming the MLP, MLP-PSO and MLP-GA techniques.

Research Significance and Motivation
Traditional ML approaches, such as ANN, SVM and ANFIS, are also applied to estimate ETo. However, complex, non-linear and high-dimensional computing issues make it difficult to find global optimum solutions, and the number of possible solutions may be infinite. It is crucial to find a workable answer in such predicaments [16]. Metaheuristic optimisation algorithms (MHAs) have proven to be effective in tackling a wide range of non-linear and difficult problems in various domains of hydrology, particularly when hybrid models are compared with standalone ML models [17-19]. These hybrid models are preferable for tackling complex practical problems [20]. They have demonstrated success in locating novel areas of the search space, which may increase the breadth of possible solutions [21]. Additionally, the capacity for exploitation is enhanced in order to avoid becoming stuck in local minima. In short, these techniques are superior in their ability to handle non-linear, multi-variable and complicated situations [22].
According to the "no free lunch" theorem, no single method is optimal for all possible optimisation problems [23]. In other words, there is no optimisation procedure that is best in all cases. Because of this, scientists in a wide variety of fields are motivated to design novel algorithms (single-based) or develop hybrid algorithms (hybrid-based) in order to tackle increasingly difficult, high-dimensional problems [16,24].
Metaheuristic optimisation algorithms have become increasingly popular and are used to develop hybrid models for hydrological research, such as ETo prediction [25]. MHAs seek the best viable answer within an optimisation problem [26,27]. Amongst these algorithms is the slime mould algorithm (SMA), which was introduced by Li et al. [28].
SMA imitates the structural and performance fluctuations of the slime mould Physarum polycephalum whilst it is seeking nourishment [29]. It has been used to address various optimisation difficulties, including engineering design problems [30] and estimation of the characteristics of solar photovoltaic cells [31]. Faramarzi et al. [32] developed the marine predators algorithm (MPA), which draws inspiration from nature and mimics predator behaviour when attacking prey. It has been employed in many fields, such as engineering applications [33] and sources of power [34]. The constriction coefficient-based particle swarm optimisation and chaotic gravitational search algorithm (CPSOCGSA) was developed by Rather and Bala [35]. It was created by merging the PSO method, which simulates the behaviour of flocks of birds, with the gravitational search algorithm (GSA), which is an application of Newton's law of universal gravitation.
The above review of the literature demonstrates how optimisation methods can be used to strengthen predictor models [36], as in Tao et al. [37] and Maroufpoor et al. [38]. The primary strength of optimisation algorithms lies in their capacity to quickly and accurately determine the optimal settings of system parameters in response to changing conditions [39]. Nevertheless, optimisation is still required, particularly in modelling hydrological variables like ETo, because of their stochastic, non-stationary nature and data noise [40].
MHAs primarily consist of two components: exploration and exploitation. "Exploration" is the process of determining the limits of an algorithm's search space, while "exploitation" describes the selection of the optimal solution from the many candidates generated [41]. A search algorithm needs to strike an optimal balance between exploration and exploitation in order to be effective [42]. Nevertheless, exploration and exploitation are inversely proportional to one another [43]. Along with creating new MHAs, another strategy is to combine the best features of several algorithms to create a superior algorithm. To this end, the hybrid particle swarm optimisation-grey wolf optimiser algorithm (PSOGWO) was developed by Şenel et al. [44]. It combines the exploitation capability of the PSO algorithm with the exploration capability of the grey wolf optimiser (GWO). Using the GWO algorithm's exploration capability prevents the PSO algorithm from settling into local minima, which can reduce the quality of the solution.
Recent research on the hybridisation of hybrid structures for time series forecasting by Hajirahimi and Khashei [45] showed the importance of data pre-processing and optimisation approaches. The hybridisation of hybrid techniques is a novel approach used to increase accuracy, in which two or more hybrid classes are fused, rather than the traditional combination of separate predicting approaches. One such tactic (HPOH) combines pre-processing-based and parameter optimisation-based hybrid techniques. There are, however, gaps in the literature that point to future research on the hybridisation of hybrid models.
Khairan et al. [17]  Examine how the novel HPOH technique simulates monthly ETo, depending on several lags.

Area of Study and Dataset
Iraq, located in the fastest-warming area of the world, faces temperatures of up to 54 °C. It is experiencing significant effects of climate change, mainly drought, which results in decreased and altered precipitation patterns, as well as extreme heat waves [46]. Al-Kut (Figure 1), the capital of Wasit Province, lies on the Tigris River in south-eastern Iraq [47], covers approximately 40 square kilometres and is considered Iraq's food basket [9]. The weather in Al-Kut City is considered normal, with short spring and autumn seasons of approximately one month each, chilly winters and very hot, dry summers [48]. The Iraqi Meteorological Service states that the winter season officially starts in November and lasts until early March, with the summer season lasting for most of the remaining months of the year and June, July and August typically being the hottest months. Mean annual precipitation ranges between 150 and 300 mm [49]. Owing to the water policies of Turkey and Iran, the inflow of the Tigris River has been reduced by approximately two-thirds. Together with changes in the oil industry that have increased the use of water, and other socio-economic factors such as Iraq's rapid population growth rate of about 2.5%, this raises serious concerns about the management of water resources [46].
The lack of meteorological data is a fundamental issue in developing nations. Owing to exceptional circumstances, i.e., wars and terrorism in Iraq, most of the data from 1990-2020 are missing. Secondary (i.e., satellite) data offered by the National Aeronautics and Space Administration (NASA), as used by Alawsi et al. [9] and Capt et al. [50], were consequently used to carry out this study. These monthly meteorological data cover the 31 years between 1 January 1990 and 30 December 2020 and comprise wind speed at a height of 2 m (U2) (m/s), relative humidity (RH) (%), dew point temperature (Tdew) (°C), minimum temperature (Tmin) (°C), maximum temperature (Tmax) (°C) and solar radiation (Rs) (MJ/m²/day). The meteorological parameters are described statistically in Table 1.

FAO-56 PM Approach
The FAO-56 PM equation (Equation (1)) is applied to calculate the ETo data, which are used to develop and assess the models under consideration. This equation was used because of its widespread application and good levels of performance [12].

$$ET_o = \frac{0.408\,\Delta\,(R_n - G) + \gamma\,\dfrac{900}{T_{ave} + 273}\,U_2\,(e_s - e_a)}{\Delta + \gamma\,(1 + 0.34\,U_2)} \qquad (1)$$

where monthly ETo is measured in mm/day, Rn is the net surface radiation (MJ m⁻² day⁻¹), G the soil heat flux (MJ m⁻² day⁻¹), Tave the average monthly temperature (°C), U2 the wind speed at a height of 2 m (m/s), es the mean saturation vapour pressure (kPa), ea the actual vapour pressure (kPa), ∆ the slope of the vapour pressure function (kPa/°C) and γ the psychrometric constant (kPa/°C). The FAO Irrigation and Drainage Paper No. 56 fully explains the method of calculation and the ETo principles [51].
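As a minimal sketch, the standard FAO-56 Penman-Monteith form of Equation (1) can be evaluated directly once the intermediate quantities are known. The function name and the input values below are illustrative assumptions only; they are not taken from the Al-Kut dataset, and the derivation of Rn, es, ea, ∆ and γ from raw meteorological data is left to the FAO-56 procedures [51].

```python
def fao56_pm_eto(delta, rn, g, gamma, t_ave, u2, es, ea):
    """Reference evapotranspiration (mm/day), standard FAO-56 Penman-Monteith form.

    delta : slope of the vapour pressure curve (kPa/degC)
    rn, g : net radiation and soil heat flux (MJ m^-2 day^-1)
    gamma : psychrometric constant (kPa/degC)
    t_ave : mean air temperature (degC)
    u2    : wind speed at 2 m (m/s)
    es, ea: saturation and actual vapour pressure (kPa)
    """
    radiation_term = 0.408 * delta * (rn - g)
    aerodynamic_term = gamma * (900.0 / (t_ave + 273.0)) * u2 * (es - ea)
    return (radiation_term + aerodynamic_term) / (delta + gamma * (1.0 + 0.34 * u2))


# Hypothetical monthly inputs, chosen only to exercise the function.
example_eto = fao56_pm_eto(delta=0.246, rn=14.33, g=0.14, gamma=0.0674,
                           t_ave=30.2, u2=2.0, es=4.42, ea=2.85)
print(f"ETo = {example_eto:.2f} mm/day")
```

Keeping the radiation and aerodynamic terms separate makes it easier to see which of the two dominates for a given month.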

Methodology
This research proposes a four-stage methodology (Figure 2) for estimating ETo on a monthly basis from previous ETo data (i.e., a univariate procedure). The four stages are: (I) data pre-processing techniques, (II) the PSOGWO algorithm, (III) the ANN model and (IV) model performance assessment. Each stage is described in detail below.

Data Pre-Processing Techniques
Before the training process, the data are normalised using a natural logarithm approach to make the data more static, reduce collinearity between predictors and increase the model's precision and speed of fit [52,53]. Singular spectrum analysis (SSA) is then used to denoise the normalised data. SSA is a powerful method of data analysis for finding important predictive properties in both linear and nonlinear time series [54]. It consists of two stages: first, the original time series is decomposed into various principal components (PCs), including trend, oscillatory and irregular components; then, the noise is eliminated and a new time series is reconstructed with less noise [55]. By examining these components, conclusions can be drawn about the properties of the original time series [56]. Mathematically, SSA can decompose a time series of n points into at most (n + 1)/2 components. Consequently, an analysis can begin with three PCs, gradually increasing the number until the best result is achieved [57]. The first PC explains the most variance of the original time series, and the last PC the least. By focusing on the PCs that contribute the most variance and ignoring those that contribute the least, SSA can be used to remove the structureless noise present in a time series [58]. This method is beneficial in a number of domains, including drought estimation [59], hydrology [60] and stream flow forecasting [61].
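The decomposition and reconstruction steps described above can be sketched with a basic SSA implementation in NumPy. This is an illustrative sketch, not the MATLAB routine used in the study: the function name, the window length of 12, the choice of three retained components and the synthetic series are all assumptions made for the example.

```python
import numpy as np


def ssa_components(series, window):
    """Split a 1-D series into additive SSA components, one per singular value."""
    x = np.asarray(series, dtype=float)
    k = x.size - window + 1
    # Trajectory (Hankel) matrix: column j holds x[j : j + window].
    trajectory = np.column_stack([x[j:j + window] for j in range(k)])
    u, s, vt = np.linalg.svd(trajectory, full_matrices=False)
    components = []
    for i in range(s.size):
        elementary = s[i] * np.outer(u[:, i], vt[i])
        # Diagonal averaging: the mean of each anti-diagonal gives one series value.
        flipped = elementary[::-1]
        component = [flipped.diagonal(d).mean() for d in range(-window + 1, k)]
        components.append(component)
    return np.array(components)


# Synthetic monthly series (hypothetical): seasonal signal in the log domain plus noise.
rng = np.random.default_rng(0)
months = np.arange(372)
log_clean = 1.5 + 0.4 * np.sin(2 * np.pi * months / 12)
eto_like = np.exp(log_clean + rng.normal(0.0, 0.1, size=months.size))

log_series = np.log(eto_like)                  # natural-log normalisation
comps = ssa_components(log_series, window=12)  # 12 components, ordered by variance
denoised_log = comps[:3].sum(axis=0)           # keep the leading components only
```

Summing all components returns the original series exactly, so the number of retained components is the only denoising decision; it can be increased gradually, as the text suggests.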

Predictive ML models can be made more accurate with the use of an appropriate sequence of temporal delays (lags). On the other hand, models with insufficient or unnecessary lags may be inaccurate or overly complicated. Many optimal lag selection approaches underperform because they assume linearity or suffer from heavy redundancy between lags. To address the collinearity issue, this research also considered the average mutual information (AMI), which may be considered a nonlinear generalisation of the autocorrelation function [62]. The mutual information (MI) method is used in this research to select the best explanatory factors. It measures the statistical dependence between the target and the lagged data, thereby allowing the lags that share the most MI with the target to be chosen [63]. The SPSS 24 statistics package was used to implement the normalisation and best-predictor selection stages, while MATLAB (version 2023b) was used to run the SSA algorithm.
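A rough illustration of MI-based lag screening is given below. It uses a simple histogram estimator written in NumPy; the study itself used SPSS for this stage, so the estimator, the bin count and the function names are assumptions for the sake of the example.

```python
import numpy as np


def mutual_information(x, y, bins=8):
    """Histogram estimate of the mutual information (in nats) between two samples."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    mask = pxy > 0
    return float(np.sum(pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])))


def lagged_mi(series, max_lag, bins=8):
    """MI between the series and each of its lagged copies, keyed by lag."""
    x = np.asarray(series, dtype=float)
    return {lag: mutual_information(x[lag:], x[:-lag], bins)
            for lag in range(1, max_lag + 1)}


# Hypothetical seasonal series standing in for denoised monthly ETo.
rng = np.random.default_rng(1)
t = np.arange(372)
series = np.sin(2 * np.pi * t / 12) + 0.1 * rng.normal(size=t.size)

mi_by_lag = lagged_mi(series, max_lag=12)
for lag, value in mi_by_lag.items():
    print(f"lag {lag:2d}: MI = {value:.3f}")
```

Unlike the autocorrelation function, the MI estimate also responds to nonlinear dependence, which is why it is treated in the text as a nonlinear generalisation of autocorrelation.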

Hybrid Particle Swarm Optimisation-Grey Wolf Optimiser Algorithm (PSOGWO)
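GWO is an MHA created by Mirjalili et al. [64] and is based on the social leadership and hunting behaviour of grey wolves in the wild. The pack is divided into four sub-groups, alpha (α), beta (β), delta (δ) and omega (ω), to represent the social hierarchy of the grey wolf. When the wolves come across prey, they close in on it, surround it on all sides and eventually catch it [65]. The process of encircling prey is given as:

$$\vec{D} = \left|\vec{C} \cdot \vec{X}_p(t) - \vec{X}(t)\right| \qquad (2)$$

$$\vec{X}(t+1) = \vec{X}_p(t) - \vec{A} \cdot \vec{D} \qquad (3)$$

where t represents the current iteration, X_p is the location of the prey, X is the location of a grey wolf and A and C are coefficient vectors. The coefficients A and C are as follows:

$$\vec{A} = 2\vec{a} \cdot \vec{r}_1 - \vec{a} \qquad (4)$$

$$\vec{C} = 2\vec{r}_2 \qquad (5)$$

where the components of a decrease linearly from 2 to 0 over the iterations and r_1 and r_2 are random vectors in [0, 1] [64]. Equations (6)-(11) show how grey wolves can adjust their position based on the whereabouts of the α, β and δ wolves:

$$\vec{D}_\alpha = \left|\vec{C}_1 \cdot \vec{X}_\alpha - \vec{X}\right| \qquad (6)$$

$$\vec{D}_\beta = \left|\vec{C}_2 \cdot \vec{X}_\beta - \vec{X}\right| \qquad (7)$$

$$\vec{D}_\delta = \left|\vec{C}_3 \cdot \vec{X}_\delta - \vec{X}\right| \qquad (8)$$

$$\vec{X}_1 = \vec{X}_\alpha - \vec{A}_1 \cdot \vec{D}_\alpha \qquad (9)$$

$$\vec{X}_2 = \vec{X}_\beta - \vec{A}_2 \cdot \vec{D}_\beta \qquad (10)$$

$$\vec{X}_3 = \vec{X}_\delta - \vec{A}_3 \cdot \vec{D}_\delta \qquad (11)$$

The results of Equations (9)-(11) are averaged to give:

$$\vec{X}(t+1) = \frac{\vec{X}_1 + \vec{X}_2 + \vec{X}_3}{3} \qquad (12)$$

where X(t+1) is the position for the next iteration.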
Kennedy and Eberhart [66] created the MHA known as particle swarm optimisation (PSO). Its commendable performance in numerous engineering applications has contributed significantly to its widespread acclaim. As a result, it has been applied in many domains to identify the best solution, such as ETo forecasting [52,67-69], intelligent agriculture [70], stream flow [71] and drought [72]. It was inspired by the natural propensity of animals to congregate in large groups, for example, flocks of birds. Flocking is interpreted by the algorithm as a foraging behaviour. In the first phase, the search domain is populated with random individuals. Each particle in the population is identified by its best known location and its current location. At the end of each iteration, the particles' positions are updated using the velocity, which is defined as:

$$v_i^{n+1} = \omega\, v_i^{n} + C_1 r_1 \left(p_i - x_i^{n}\right) + C_2 r_2 \left(p_g - x_i^{n}\right) \qquad (13)$$

$$x_i^{n+1} = x_i^{n} + v_i^{n+1} \qquad (14)$$

where i refers to the particle in the swarm, n is the iteration number and r_1 and r_2 are random numbers between 0 and 1. ω is the inertia weight parameter, whose value is given as 0.5 + rand/2; the optimisation coefficients C_1 and C_2 are both set to 0.5; x is the position, v the velocity, p_i the best position that particle i has attained and p_g the best position found by the swarm. The new position and velocity of the particles with smaller possibilities are disregarded in favour of a random position inside the search space in order to avoid local minima. The procedure continues until the ideal outcome is achieved or the predetermined maximum number of iterations is exhausted.
The hybrid PSOGWO algorithm (Figure 3) was created by Şenel et al. [44], with both the PSO and GWO algorithms retaining their original characteristics in their entirety. The PSO algorithm is powerful enough to uncover optimal solutions to practically all complicated problems. However, there is a need to reduce the chance of PSO becoming trapped in local optima. GWO supports the swarm algorithm by reducing the probability of encountering a local minimum: its exploration capability is applied to move some of the candidates to areas where GWO has made improvements, so that the swarm is less likely to be drawn to local minima. Şenel et al. [44] describe the hybrid optimisation method using Algorithm 1. PSO is used to start the process, and when a random number smaller than the selected probability rate is drawn, the process switches to GWO. The position is then updated after the GWO algorithm has been run. The transition from PSO to GWO helps to keep solutions out of local minima. The procedure then returns to PSO and ends when the maximum number of iterations is reached.

Algorithm 1. Hybrid PSOGWO procedure [44] (listing incomplete; closing steps only)
9 Update PSO position
10-13 end (closing the nested loops and conditions)
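The switching logic described for the hybrid can be sketched as follows. This is a simplified illustration written for this text, not a reproduction of Algorithm 1 in Şenel et al. [44]: the function names, the use of the three best personal bests as the α, β and δ leaders, the number of GWO refinement steps and the default switching probability are all assumptions. The PSO settings (inertia weight 0.5 + rand/2 and C1 = C2 = 0.5) follow the description above.

```python
import numpy as np


def gwo_move(x, leaders, a, rng):
    """One GWO update: average the pulls of the alpha, beta and delta leaders."""
    candidates = []
    for leader in leaders:
        A = 2.0 * a * rng.random(x.size) - a      # coefficient vector A
        C = 2.0 * rng.random(x.size)              # coefficient vector C
        D = np.abs(C * leader - x)                # distance to this leader
        candidates.append(leader - A * D)
    return np.mean(candidates, axis=0)


def psogwo_minimise(fitness, dim, low, high, pop_size=20, iterations=100,
                    gwo_prob=0.1, gwo_steps=5, seed=0):
    """PSO main loop with an occasional GWO refinement of single particles."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(low, high, size=(pop_size, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p) for p in pos])
    best = int(np.argmin(pbest_fit))
    gbest, gbest_fit = pbest[best].copy(), float(pbest_fit[best])
    history = [gbest_fit]
    c1 = c2 = 0.5
    for _ in range(iterations):
        for i in range(pop_size):
            w = 0.5 + rng.random() / 2.0          # inertia weight
            r1, r2 = rng.random(dim), rng.random(dim)
            vel[i] = (w * vel[i] + c1 * r1 * (pbest[i] - pos[i])
                      + c2 * r2 * (gbest - pos[i]))
            pos[i] = np.clip(pos[i] + vel[i], low, high)
            if rng.random() < gwo_prob:           # switch to GWO for this particle
                leaders = pbest[np.argsort(pbest_fit)[:3]]
                for step in range(gwo_steps):
                    a = 2.0 * (1.0 - step / gwo_steps)   # decreases from 2 towards 0
                    pos[i] = np.clip(gwo_move(pos[i], leaders, a, rng), low, high)
            fit = float(fitness(pos[i]))
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i].copy(), fit
            if fit < gbest_fit:
                gbest, gbest_fit = pos[i].copy(), fit
        history.append(gbest_fit)
    return gbest, gbest_fit, history


def sphere(x):
    """Toy objective standing in for the ANN validation RMSE."""
    return float(np.sum(x ** 2))


best_x, best_f, history = psogwo_minimise(sphere, dim=3, low=-5.0, high=5.0)
print(best_x, best_f)
```

In the study's setting, the objective would be the RMSE of an ANN trained with a candidate (Lr, N1, N2) triple rather than the toy sphere function used here.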

Artificial Neural Network (ANN)
ANNs are ML techniques that simulate how the human brain learns through experience [73]. An ANN is a powerful computational technique for non-linear systems [74]. Because it is based on the simultaneous operations of biological nervous systems, it can map process inputs to outputs without a complete understanding of the underlying physics [38]. A common neural network layout is the multilayer perceptron (MLP) network [75]. The MLP used here has four layers: the input layer, which contains the ETo computed using FAO-56 PM; two hidden layers, which use tan-sigmoid activation functions to handle complex nonlinearity; and the output layer, which contains the forecasted ETo. Because a time-intensive trial-and-error method is not necessarily the best way to configure the network, MHAs are combined with the ANN to select the optimal learning rate (Lr) and the number of neurons (N1 and N2) in the hidden layers [76]. Hybrid techniques have been shown to increase the performance of ANN, producing optimal input/output mapping while saving time.
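The layer structure described above can be illustrated with a small NumPy forward pass. This is a sketch under stated assumptions: the hidden-layer sizes, the random weight initialisation and the helper names are hypothetical, training is omitted, and the tan-sigmoid activation is written as `tanh`, to which it is mathematically equivalent.

```python
import numpy as np


def make_lagged_inputs(series, n_lags=4):
    """Build (inputs, targets): each row holds Lag 1..Lag n of the target value."""
    x = np.asarray(series, dtype=float)
    inputs = np.column_stack([x[n_lags - k: x.size - k] for k in range(1, n_lags + 1)])
    targets = x[n_lags:]
    return inputs, targets


def init_mlp(n_inputs, n1, n2, seed=0):
    """Random weights for input -> hidden (N1) -> hidden (N2) -> output."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.5, (n_inputs, n1)), "b1": np.zeros(n1),
        "W2": rng.normal(0.0, 0.5, (n1, n2)), "b2": np.zeros(n2),
        "W3": rng.normal(0.0, 0.5, (n2, 1)), "b3": np.zeros(1),
    }


def mlp_forward(inputs, params):
    """Two tan-sigmoid hidden layers followed by a linear output neuron."""
    h1 = np.tanh(inputs @ params["W1"] + params["b1"])
    h2 = np.tanh(h1 @ params["W2"] + params["b2"])
    return h2 @ params["W3"] + params["b3"]


# Hypothetical usage: four lags in, one forecast out; N1 = 6 and N2 = 3 are arbitrary.
demo_series = np.arange(10.0)
demo_inputs, demo_targets = make_lagged_inputs(demo_series, n_lags=4)
demo_params = init_mlp(n_inputs=4, n1=6, n2=3)
demo_output = mlp_forward(demo_inputs, demo_params)
```

N1 and N2 appear only as the shapes of the weight matrices, which is why they, together with Lr, are natural targets for the metaheuristic search.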

ANN-Based MHAs
In the field of hydrology, ANN is currently the most popular ML model, and it has been shown to be superior to support vector regression (SVR) and genetic programming (GP) approaches for estimating ETo [4]. However, the ANN's Lr coefficient and N1 and N2 are crucial design hyperparameters: they shape the mapping between the predictors and the target and thereby determine the error. Using a trial-and-error procedure to establish these hyperparameters can be risky because of the high computational complexity and potential for mistakes [77]. It is also possible to overfit the data if a high number of neurons is added to the hidden layers using the trial-and-error procedure [78].
In order to overcome the shortcomings of the available single models, academics must integrate multiple strategies to attain higher precision, greater stability and significant dependability [79]. Hence, MHAs are applied to tune the ML models, to reduce the time needed for the training phase and to avoid becoming trapped in local minima [80].
Selecting an effective MHA is complex and presents additional difficulties, so several MHAs were compared: PSOGWO and CPSOCGSA (hybrid-based) and MPA, SMA and MPSO (single-based). Each hybrid algorithm was run through 200 iterations with each of five population sizes (popsize: 10, 20, 30, 40 and 50), and each swarm was repeated five times to determine which provided the lowest value of the fitness function (RMSE). This technique helps to increase the capability of the ANN model to generalise to unseen data in the validation stage.

Model Performance Assessment
To assess the error and consistency between the predicted results and calculated values, the following metrics are used: root mean squared error (RMSE), scatter index (SI), Nash-Sutcliffe model efficiency (NSE) and coefficient of determination (R²). The model's precision and quality of fit increase as R² gets closer to 1 [65]. However, R² only examines linear relationships between simulated and computed ETo values [81,82]. Therefore, further statistical metrics are required to assess the accuracy of the modelling. RMSE illustrates the average magnitude of error, weighting large errors more heavily [83]. It also shows the deviation between calculated and predicted values [84]. Model deviance decreases as RMSE decreases [85]. The SI rates model performance qualitatively as excellent, good, fair or poor: the model's accuracy is excellent when SI is less than 10%, good when it is between 10% and 20%, fair when it is between 20% and 30% and poor when it is greater than 30% [13]. Finally, NSE compares the ratio of the model error variance to the variance of the observed data, with an ideal value of unity. The NSE index has been used in many hydrological investigations [86]. Its value, as established by Pan et al. [87], ranges from 1 to −∞, and NSE indicates good model performance when it is close to 1. The required formulas are defined in Equations (15)-(18):

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(R_i - P_i\right)^2} \qquad (15)$$

$$\mathrm{SI} = \frac{\mathrm{RMSE}}{\bar{R}} \qquad (16)$$

$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{N}\left(R_i - P_i\right)^2}{\sum_{i=1}^{N}\left(R_i - \bar{R}\right)^2} \qquad (17)$$

$$R^2 = \left[\frac{\sum_{i=1}^{N}\left(R_i - \bar{R}\right)\left(P_i - \bar{P}\right)}{\sqrt{\sum_{i=1}^{N}\left(R_i - \bar{R}\right)^2 \sum_{i=1}^{N}\left(P_i - \bar{P}\right)^2}}\right]^2 \qquad (18)$$

where R_i is the monthly ETo determined by the standard FAO-56 PM formula, P_i the predicted monthly ETo as determined by the prediction models used, R̄ the mean of the calculated monthly ETo, P̄ the mean of the predicted monthly ETo and N the data length. In addition, graphical plots, i.e., Bland-Altman plots, have been applied to examine the predictive performance of the suggested method.
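The four metrics can be computed in a few lines of NumPy. The sketch below uses the standard definitions of RMSE, SI (RMSE divided by the mean of the reference values), NSE and R² (the squared Pearson correlation); the function names and the small example arrays are assumptions for illustration, and the SI classes follow the thresholds quoted above.

```python
import numpy as np


def evaluate(reference, predicted):
    """Return RMSE, SI, NSE and R^2 for reference (FAO-56 PM) and predicted ETo."""
    r = np.asarray(reference, dtype=float)
    p = np.asarray(predicted, dtype=float)
    squared_error = np.sum((r - p) ** 2)
    rmse = np.sqrt(squared_error / r.size)
    return {
        "RMSE": float(rmse),
        "SI": float(rmse / r.mean()),
        "NSE": float(1.0 - squared_error / np.sum((r - r.mean()) ** 2)),
        "R2": float(np.corrcoef(r, p)[0, 1] ** 2),
    }


def si_class(si):
    """Qualitative rating of the scatter index (given as a fraction, not a percentage)."""
    if si < 0.10:
        return "excellent"
    if si < 0.20:
        return "good"
    if si < 0.30:
        return "fair"
    return "poor"


# Hypothetical values, used only to exercise the functions.
reference = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
scores = evaluate(reference, predicted)
print(scores, si_class(scores["SI"]))
```

Reporting SI alongside RMSE is useful because SI is scaled by the mean ETo, so it can be compared across stations or seasons with different magnitudes.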

Data Pre-Processing Analysis
Step A of the methodology (Figure 2) is represented here. The natural logarithm is used to normalise the data; the normalised time series is then decomposed into 12 components using SSA to obtain noise-free ETo time series data. After denoising, the first component represents the time series with the highest value. The normalised time series and the first four components of the ETo parameters are shown in Figure 4. Pre-processing of data improves the correlation coefficients (CCs) between the ETo (target) and the model inputs (lags). For example, the CC of Lag 1 increased from 0.83 in the raw data to 0.99. The CCs for the first four lags in the denoised data were 0.99, 0.96, 0.91 and 0.85, respectively. The MI method (shown in Figure 5) was used to determine the best possible input scenario for the model (univariate technique). According to Stergiou [88], the time lag is selected as the first minimum of the average mutual information (AMI). Based on the AMI figure, four lags (Lag 1-Lag 4) of the monthly ETo values are used to simulate future monthly ETo values. Tabachnick et al. [89] proposed that the sample size (N) required to construct an adequate model depends on the number of predictors (m), as given by Equation (19). The sample size in this study is 372, which exceeds the required N of 136.

ANN Technique Configuration
Training, testing and validation datasets are created after the data pre-processing techniques are applied. To construct a reliable ETo prediction model, a systematic configuration of the ANN model, rather than trial and error, is required. Therefore, the optimal values for the ANN model's hyperparameters, Lr, N1 and N2, are determined using four hybrid methods, PSOGWO-ANN, CPSOCGSA-ANN, MPA-ANN and SMA-ANN, implemented with the MATLAB toolbox. These hybrid techniques were tested with five different popsizes: 10, 20, 30, 40 and 50. The maximum number of iterations is set to 100 for all algorithms. To find the best possible solution, each algorithm is run five times for each popsize; e.g., for the PSOGWO-ANN algorithm shown in Figure 6, the best runs (popsize-run) are 10-4, 20-4, 30-3, 40-2 and 50-2.
Table 2 shows the optimal ANN hyper-parameters determined from the four suggested hybrid algorithms.

Performance Evaluation
After configuring the ANN with the optimal hyperparameters, the model was run repeatedly (using the MATLAB Neural Network Toolbox) to find the best network, weights and biases to accurately predict the monthly ETo. A battery of statistical tests was conducted to determine whether the ANN could generalise ETo data by considering four lags in the validation phase and to compare the effectiveness of the established strategies. Table 3 displays the outcomes of the four tests applied to assess the efficiency of the models. NSE and R² evaluate the linear dependence between measured and simulated ETo data, while RMSE and SI evaluate the nonlinear dependence. According to the criteria stated in Section 3.5, the ANN models provide good to excellent accuracy. Nevertheless, ANN combined with the PSOGWO algorithm slightly outperforms the other models, with the lowest RMSE and SI and the highest NSE and R².
The best popsize is then chosen for each algorithm so that it can be compared with the other popsizes for the same algorithm (Figure 7). The figure shows the best popsizes to be 50-2 for MPA-ANN (RMSE = 0.00128 after 141 iterations), 40-2 for SMA-ANN (RMSE = 0.0013 after 96 iterations), 40-2 for PSOGWO-ANN (RMSE = 0.001338 after 40 iterations), 50-5 for CPSOCGSA-ANN (RMSE = 0.001343 after 18 iterations) and 50-3 for MPSO-ANN (RMSE = 0.001291 after 188 iterations).
Figure 8 presents the R² values for all the suggested hybrid models. The results of all models demonstrate good simulation levels for the ETo time series based on R², according to Dawson et al. [90]. R² values offer information about the linear relationship between the measured ETo value (i.e., computed by FAO-56 PM, the target) and the predicted ETo value (the output) for all models. The graphs demonstrate excellent levels of consistency between measured and forecast data, and the absence of any irregular data or distinct pattern trends.
Figure 9 compares the measured and simulated data for the validation phase from the PSOGWO-ANN, CPSOCGSA-ANN, MPA-ANN, SMA-ANN and MPSO-ANN models. The simulated data closely match the observed data pattern (trend + periodicity) along the time series.
Figure 8 presents the R 2 values for all the suggested hybrid models.The results of all models demonstrate good simulation levels for the ETo time series based on R 2 , according to Dawson et al. [90].R 2 values offer information about the linear relationship between the measured ETo value (i.e., computed by FAO56-PM, target) and predicted ETo value (output), for all models.The graphs demonstrate excellent levels of consistency between measured and forecast data, and the absence of any irregular data or distinct pattern trends.The suggested hybrid techniques were further examined to verify the models' power to accurately simulate monthly ETo in Al-Kut City.The Bland-Altman plot (Figure 10) is also taken into account to determine the degree of systematic variance, the scatter of values and whether there is a correlation between the observed and expected error.The results in Figure 10 are significant because the PSOGWO-ANN model has 96% of the data scattered between the red and green boundaries of the acceptable range, while the rest of the models contain 94% of the data scattered between the boundaries.The suggested hybrid techniques were further examined to verify the models' power to accurately simulate monthly ETo in Al-Kut City.The Bland-Altman plot (Figure 10) is also taken into account to determine the degree of systematic variance, the scatter of values and whether there is a correlation between the observed and expected error.The results in Figure 10 are significant because the PSOGWO-ANN model has 96% of the data scattered between the red and green boundaries of the acceptable range, while the rest of the models contain 94% of the data scattered between the boundaries.Regarding all the performance tests, the suggested techniques reveal good to excellent performance in simulating monthly ETo data in the validation phase.Nevertheless, ANN combined with the PSOGWO algorithm slightly outperformed the other models.Regarding all the performance tests, the 
suggested techniques reveal good to excellent performance in simulating monthly ETo data in the validation phase.Nevertheless, ANN combined with the PSOGWO algorithm slightly outperformed the other models.
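The two agreement checks described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it assumes that R² is the squared Pearson correlation between measured and simulated values, and that the plot boundaries are the conventional limits of agreement (mean difference ± 1.96 standard deviations). The function names and the sample values are hypothetical.

```python
from statistics import mean, stdev


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    mo, mp = mean(observed), mean(predicted)
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)


def bland_altman(observed, predicted, z=1.96):
    """Return (bias, lower limit, upper limit, share of points inside the limits).

    Assumption: the limits are bias +/- z * sample standard deviation
    of the differences (observed - predicted).
    """
    diffs = [o - p for o, p in zip(observed, predicted)]
    bias = mean(diffs)
    sd = stdev(diffs)
    lower, upper = bias - z * sd, bias + z * sd
    inside = sum(lower <= d <= upper for d in diffs) / len(diffs)
    return bias, lower, upper, inside


# Hypothetical measured and simulated monthly ETo values (illustration only).
measured = [2.1, 3.4, 5.6, 7.8, 9.0, 8.2, 6.1, 4.0, 2.5, 1.9]
simulated = [2.0, 3.5, 5.5, 7.9, 8.8, 8.3, 6.0, 4.1, 2.6, 1.8]

print("R^2:", r_squared(measured, simulated))
print("Bland-Altman (bias, lower, upper, share inside):",
      bland_altman(measured, simulated))
```

The share of points inside the limits is the quantity that the 96% and 94% figures in the text refer to, under the stated assumption about how the boundaries are drawn.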

Discussion
Three pre-processing methods were applied in this research as the first stage of the suggested methodology: normalisation by natural logarithm, cleaning using the SSA technique and MI to select the best predictors. The first two pre-processing approaches enhanced the quality of the raw data, with the CC of Lag 1 increasing from 0.83 to 0.99. However, the most crucial step in developing any forecast strategy is identifying the number of predictors for the ML techniques. In this study, Figure 5 revealed that the first four lags were selected as the optimum model input by avoiding multicollinearity, which is vital for accurate ETo estimation. This is in line with the findings of Roy et al. [25] and Sayyahi et al. [15], who applied various variable selection techniques to determine the best number of predictors and thereby obtain the best possible forecast performance.
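The lag-screening idea can be illustrated with a short sketch. It covers only the logarithmic normalisation and the per-lag correlation and mutual information scores; the SSA cleaning step is omitted. The histogram-based mutual information estimator, the bin count, the function names and the synthetic series are all assumptions made for illustration, not the procedure used in the study.

```python
import math
from collections import Counter


def log_normalise(series):
    """Natural-log transform; requires strictly positive values."""
    return [math.log(x) for x in series]


def lagged_pairs(series, lag):
    """Return (past, present) so that past[t] precedes present[t] by `lag` steps."""
    return series[:-lag], series[lag:]


def pearson(xs, ys):
    """Pearson correlation coefficient (CC) of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def mutual_information(xs, ys, bins=8):
    """Plug-in histogram estimate of mutual information, in nats."""
    def to_bins(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # guard against a constant series
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = to_bins(xs), to_bins(ys)
    n = len(xs)
    joint, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    # sum over occupied cells of p(i,j) * log(p(i,j) / (p(i) * p(j)))
    return sum((c / n) * math.log(c * n / (px[i] * py[j]))
               for (i, j), c in joint.items())


def screen_lags(series, max_lag):
    """Map each lag to (correlation coefficient, mutual information)."""
    scores = {}
    for lag in range(1, max_lag + 1):
        past, present = lagged_pairs(series, lag)
        scores[lag] = (pearson(past, present), mutual_information(past, present))
    return scores


# Hypothetical monthly series with a 12-month cycle (illustration only).
eto = [4.0 + 3.0 * math.sin(2 * math.pi * t / 12) for t in range(120)]
scores = screen_lags(log_normalise(eto), max_lag=12)
for lag, (cc, mi) in scores.items():
    print(f"lag {lag:2d}: CC = {cc:+.3f}, MI = {mi:.3f}")
```

In practice the scores would be computed on the cleaned series, and lags would be retained according to whatever selection rule the analyst adopts; the sketch only produces the scores.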
ML models have multiple disadvantages, such as a low rate of convergence and the difficulty of avoiding local minima. Recent advances in hybrid modelling have improved ML, paving the way for further development of standalone models to increase their accuracy [17]. A recent systematic review by Khairan et al. [17] analysed 33 previous studies on ETo data prediction using hybrid models, all of which integrated ML models with a wide range of MHAs. All these studies, which cover various areas and time scales, found that hybrid approaches perform better than standalone models.
Based on multiple statistical and graphical tests used to examine the models, as shown in Section 3.5, the results agree that both types of MHAs, i.e., single and hybrid algorithms, enhance the performance of the standalone ANN method when simulating ETo data. The hybrid MHAs slightly outperform single MHAs, with R² of 0.99899, 0.99896, 0.99888, 0.99895 and 0.99894 for PSOGWO-ANN, CPSOCGSA-ANN, MPA-ANN, SMA-ANN and MPSO-ANN, respectively. PSOGWO-ANN is better at determining the optimal hyperparameters (Lr, N1 and N2) of the ANN technique and also saves time. These results reflect those of Adnan et al. [40] and Khalilpourazari and Khalilpourazary [91], who also found that hybrid MHAs outperform single MHAs. This finding may assist in better understanding and regulation of water balance processes. Further work could consider whether the learning process can be improved by examining alternative tuning parameters for the optimisers and by applying training strategies such as transfer learning and reinforcement learning.
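To make the hyperparameter search concrete, the sketch below runs a plain particle swarm optimiser over the three ANN hyperparameters (Lr, N1, N2). It is deliberately a basic PSO, not the PSOGWO hybrid used in the study, and the objective is a hypothetical stand-in for the validation error of a trained ANN. The bounds, swarm settings and function names are likewise assumptions.

```python
import random


def pso(objective, bounds, n_particles=20, n_iter=100,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise `objective` inside box `bounds` with a basic PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # inertia + pull towards personal best + pull towards global best
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # move and clamp to the search box
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_objective(params):
    """Hypothetical surrogate for ANN validation error (not a real model).

    Its minimum lies at Lr = 0.05, N1 = 8, N2 = 5; node counts are rounded
    because hidden-layer sizes must be integers.
    """
    lr, n1, n2 = params
    return ((lr - 0.05) ** 2
            + ((round(n1) - 8) / 20) ** 2
            + ((round(n2) - 5) / 20) ** 2)


# Assumed search ranges: learning rate, nodes in hidden layers 1 and 2.
BOUNDS = [(0.001, 0.5), (1, 20), (1, 20)]

best, best_err = pso(toy_objective, BOUNDS)
lr, n1, n2 = best[0], round(best[1]), round(best[2])
print(f"Lr = {lr:.4f}, N1 = {n1}, N2 = {n2}, surrogate error = {best_err:.6f}")
```

In a real workflow, `toy_objective` would be replaced by a function that trains the ANN with the candidate hyperparameters and returns its error on held-out data, which is where most of the computing time is spent.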

Conclusions
This research offers a novel hybrid methodology to simulate monthly ETo data by combining an ANN model with the contemporary PSOGWO algorithm and data pre-processing techniques. Monthly ETo was forecast using this methodology, which was developed and tested using a number of prior lags. The performance of the PSOGWO-ANN model was evaluated and validated against the CPSOCGSA-ANN, MPA-ANN, SMA-ANN and MPSO-ANN models. The following conclusions were drawn from this investigation:

• The data pre-processing procedures used in the present study, i.e., the use of SSA and MI, are essential for improving the quality of raw data and selecting the best lagged scenario, whereby the CC of Lag 1 increased from 0.83 to 0.99.

• ANN is an effective tool for predicting evapotranspiration. Integrating it with MHAs improves its performance and saves time by selecting the optimal learning rate (Lr) and the numbers of hidden-layer nodes (N1 and N2).

• All the forecasting models performed well and similarly, but PSOGWO-ANN slightly outperformed the other hybrid models. The best model shows that the suggested methodology is an accurate strategy for predicting monthly ETo, with R² = 0.99, RMSE = 0.00151, SI = 0.08317 and NSE = 0.99896.
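For reference, the error statistics quoted above can be computed as in the following sketch. It assumes the standard definitions: RMSE as the root mean squared error, SI as the scatter index (RMSE divided by the mean of the observations) and NSE as the Nash-Sutcliffe efficiency. The source does not restate these formulas at this point, so the definitions and the sample values are assumptions.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def scatter_index(observed, predicted):
    """Assumed definition: RMSE normalised by the mean of the observations."""
    return rmse(observed, predicted) / (sum(observed) / len(observed))


def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean predictor."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


# Hypothetical validation data (illustration only).
measured = [2.1, 3.4, 5.6, 7.8, 9.0, 8.2, 6.1, 4.0, 2.5, 1.9]
simulated = [2.0, 3.5, 5.5, 7.9, 8.8, 8.3, 6.0, 4.1, 2.6, 1.8]

print("RMSE:", rmse(measured, simulated))
print("SI:  ", scatter_index(measured, simulated))
print("NSE: ", nse(measured, simulated))
```

Because RMSE and SI depend on the scale of the data, values computed on log-normalised series are not directly comparable with values computed in the original units.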
Given the importance of wheat planting in Al-Kut City, this research may be useful in advising policymakers on how to allocate water supplies. Future research into the role of MHAs and data pre-processing approaches would help researchers improve HPOH strategies for ETo prediction models. The hybrid PSOGWO-ANN model is a promising technique for estimating monthly ETo in other agricultural provinces of Iraq, such as Kirkuk and Salahaddin, and is therefore strongly recommended for this purpose.

Figure 1. A map showing the location of the research site (Wasit Province).

Figure 2. A workflow diagram showing the steps involved in univariate ETo simulation.

Figure 4. Normalised data and the first four components acquired using SSA.

Figure 5. Average mutual information (AMI) as a function of the ETo time series.

Figure 6. Performance of the PSOGWO algorithm.

Figure 7. Fitness function of the suggested hybrid algorithms under five population sizes.

Sustainability 2023

Figure 8. Coefficients of determination for the suggested models.

Figure 9. Measured and simulated ETo data comparison for all suggested strategies in the validation phase.

Figure 10. Bland-Altman plot of the suggested hybrid techniques in the validation stage.
reviewed the hybrid techniques (ML and MHAs) available to predict ETo. The hybrid techniques have been successfully applied to predicting ETo. Up to this point in time, far too little attention has been paid to hybrid-based MHAs (6%) compared with single-based MHAs. Accordingly, there is still room for development concerning ETo prediction models.

Table 1. The statistical parameters of the meteorological time series.
LR denotes the learning rate; N1 and N2 denote the numbers of nodes in the first and second hidden layers, respectively.

Table 3. Performance of the hybrid models for the validation data stage.