A New Approach for Completing Missing Data Series in Pan Evaporation Using Multi-Meteorologic Phenomena

The most crucial losses in the hydrological cycle occur due to evaporation (EP). As a result, the accurate attainment of this complex phenomenon is critical in studies on irrigation, efficiency in the basins, dams, continuous hydrometeorological simulations, flood frequency, and water budget analysis. However, EP data sets are expensive, difficult to measure sustainably, and scarce; in addition, predictions are challenging due to the wide range of parameters involved in these processes. In this study, the data gaps are filled with Class A evaporation pan observations from a newly built meteorological station during seasons with no gauge measurements available over a three-year period. These observations demonstrate high correlations with the readings from the Meteorology Airport Station, with a Pearson correlation coefficient (PCC) of 0.75. After the continuous EP time series was completed over Kahramanmaras, these values were retrieved non-linearly via an artificial intelligence model using multi-meteorological parameters. The simulation performance is evaluated with eight different statistical metrics in addition to graphical representations. The evaluation reveals that, compared to the other EP functions, the simulations driven by both temperature and wind have the highest correlation (PCC = 0.94) and NSCE (0.87), as well as the lowest errors (PBias = −1.65%, MAE = 1.27 mm d⁻¹, RMSD = 1.6 mm d⁻¹, CRMSE = 24%) relative to the gauge measurements, while the solely precipitation-based models give the opposite results (PCC = 0.42, NSCE = 0.17, PBias = −6.44%, MAE = 3.58 mm d⁻¹, RMSD = 4.2 mm d⁻¹, CRMSE = 62%). The temperature parameter is clearly the most essential factor, while precipitation alone may be insufficient in EP predictions; additionally, wind speed and relative humidity would improve the prediction performance in artificial intelligence techniques.


Introduction
Evaporation (EP) is the phenomenon of water transfer from the earth and water surfaces to the atmosphere and is an important element in water budget analysis, the management of water resources, and the design of hydrometeorological models [1,2]. Additionally, this phenomenon can greatly affect the capacities of reservoirs in terms of both natural lakes and dams, the efficiency of basins, irrigation programming, and the size of hydromechanical structures like pump stations in the catchments [3][4][5]. EP estimations from lake surfaces, however, are a challenge given the multitude of spatiotemporally varying parameters involved in these processes and the measurements required to validate these estimates [6].
EP can be determined using water budget analysis, energy balance, mass transfer methods, and empirical equations. In addition, in situ measurements for the direct detection of EP loss can be used in order to obtain results that correlate better with lake evaporation. One of the most popular measuring instruments used in this manner is the evaporation container standardized by the United States National Weather Service and known as a Class A evaporation pan. This pan, which is also recommended by the World Meteorological Organization (WMO) and the International Association of Hydrological Sciences (IAHS), is widely used at different latitudes and altitudes under various climatic conditions [7][8][9]. The values obtained from the pan depend on the type of container used, the climate, and the temperature, and therefore cannot completely reflect the evaporation from an open water surface. For this reason, the measured losses need to be multiplied by a correction coefficient, which varies between 0.35 and 0.85, with an average of 0.70 worldwide, according to the Food and Agriculture Organization (FAO). This coefficient is also usually taken as 0.70 in Turkey [10][11][12].
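The pan correction described above amounts to a single multiplication. The following is a minimal sketch, assuming a hypothetical helper name (`open_water_evaporation`) and illustrative readings that are not taken from the study; the default of 0.70 and the 0.35-0.85 bounds are the values quoted in the text.

```python
def open_water_evaporation(pan_evaporation_mm, pan_coefficient=0.70):
    """Scale a Class A pan reading (mm per day) by a pan coefficient.

    The helper name is hypothetical; the default and the accepted range
    are the figures quoted in the text (0.70 and 0.35-0.85).
    """
    if not 0.35 <= pan_coefficient <= 0.85:
        raise ValueError("pan coefficient outside the 0.35-0.85 range")
    if pan_evaporation_mm < 0:
        raise ValueError("pan evaporation cannot be negative")
    return pan_coefficient * pan_evaporation_mm


# Illustrative pan readings in mm per day (not measurements from the study).
pan_readings = [2.0, 5.0, 10.0]
corrected = [open_water_evaporation(value) for value in pan_readings]
```

Rejecting coefficients outside the quoted range is a design choice of this sketch; a site-specific calibration could justify other values.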
Measurements are still taken manually today. For this reason, the workload, device budget, calibration costs arising from measurement continuity, and climatic conditions restrict the station data. Thus, several methods and empirical equations have been investigated to obtain EP values where data are limited and/or sites are ungauged [13][14][15][16][17][18][19]. For instance, Haile et al. [13] used the Hargreaves empirical method, which requires precipitation and the minimum and maximum temperature, to estimate the evaporation in their study. The Standardized Evapotranspiration Indexes were applied in the study to examine the magnitudes of the spatiotemporal patterns of drought at the Bilate Basin in Ethiopia. Mobilia and Longobardi [14] applied the Penman-Monteith, Priestley-Taylor, Blaney-Criddle, advection-aridity, Granger and Gray, and antecedent precipitation index methods for predicting the evapotranspiration losses. The results reveal that the performances differ from site to site and depend on the vegetation and climate characteristics. Additionally, they emphasize that empirical methods require numerous data sets and highlight the importance of model calibration. Raza et al. [15] highlighted the need for evaporation and evapotranspiration values in crop water estimation and irrigation scheduling. They also indicated that the standard methods for evaporation estimates require various data sets, which may not be available in developing countries. Therefore, the multi-layer perceptron, gene expression programming, and radial basis function were utilized to estimate these values. Their results show that the multi-layer perceptron can be considered as an alternative method on a monthly basis. The Penman and Linacre models were applied by Sezer and Oztekin [17] in order to eliminate some problems in building and running a Class A evaporation pan over Samsun, in Turkey. They found that the predictions are generally overestimated for low evaporation values, while the performance of the Penman model is better than that of the Linacre model. Singh and Xu [18] proposed thirteen equations based on the mass-transfer method for determining evaporation on a monthly basis over Ontario, in Canada. The results show that the effect of the wind velocity on the monthly evaporation was marginal [18]. In another example, the performance of the Hargreaves-Samani and Blaney-Criddle equations and four modified versions of these two equations were investigated over 6 years at 11 stations in Van, Turkey, by Uzunlar et al. [19]. The evapotranspiration calculations were assessed with daily meteorological data, and the FAO Penman-Monteith equation was used as a reference in the study. The Hargreaves-Samani equation and its modified versions gave acceptable results when all the results were examined. Gumus et al. [20] estimated the pan evaporation values by using artificial neural networks (ANN), the adaptive network-based fuzzy logic inference system (ANFIS), and gene expression programming (GEP) in the Sanliurfa and Diyarbakir regions in Turkey. For this purpose, simulations were carried out with the three methods based on the average temperature values. It was observed that all of these methods provided successful results, although the GEP method performed best in the simulations. In a similar manner, Sezer et al. [21] estimated the daily and monthly evaporation losses for Class A pans in Samsun with the Kohler-Nordenson-Fox (KNF) and Christiansen models during the 2012-2013 period. The results show that the KNF model estimated the EP values with a higher performance, whereas the Christiansen model overestimated the evaporation. However, when its output is multiplied by a certain reduction coefficient, realistic results can also be obtained with the latter model over the region.
EP estimates are also used today to increase the reliability of hydrometeorological data sets and/or hydrological simulations. Droogers and Allen [22] estimated evapotranspiration on a global scale using a high-resolution monthly climate data set. They concluded that a modified Hargreaves method that includes a rainfall term improved the evapotranspiration estimates significantly for arid regions. The study emphasizes that reducing the errors in meteorological observations, like evaporation, requires accurate data sets. Additionally, knowing the rate of evaporation from surface water resources such as channels and reservoirs is essential for the precise management of the water balance, and the Penman-Monteith-Unsworth and Penman-Brutsaert models were used to assess this issue with respect to Class A pan measurements [23]. Yates and Strzebek [24] used various evapotranspiration methods and analyzed their effects on the river basin runoff. Four river basins of different sizes and hydro-climatic variability were selected as case studies under climate change impact. They concluded that the magnitude and temporal distribution of the evapotranspiration estimates are important for the assessment of water resource systems. Evaporation from the surface of water is a vital component of the hydrological cycle, soil hydraulics, solute transport parameters, and the modeling of hydrological processes [25,26]. For instance, Nalcioglu et al. [27] analyzed the hydrometeorological data sets over the Asi Basin in Turkey during the years 1962-2011 via the DataFit program. In the analysis, the relationship between the hydrometeorological parameters affecting the Asi Basin and the flow was modeled. Based on the highest correlations and the lowest error rates, a precipitation-evaporation-runoff model was proposed, and the regression analysis was observed to fit successfully. Ercan et al. [28] found a formulation that can be used to predict the flow rate of the Kahramanmaras Aksu River with the regression analysis program (DataFit). In the simulation, precipitation, evaporation, and wind speed are examined as input parameters, and the flow rate is used as the output parameter. Multiple statistical indexes (i.e., mean square error, root mean square error, coefficient of determination, adjusted multiple determination coefficient, etc.) were applied to quantify the model performance. The parameters were calculated for each method and the equations were evaluated for each model. The results show that the estimated values captured the observed data sets.
In this study, the EP was obtained with an artificial intelligence technique using various meteorological data sets. The program, which consists of graphs and equations, creates various models to run the parameter values of the study area in harmony. The aim is to determine the most suitable model among all simulations. DataFit is a program that simplifies regression analysis, statistical analysis, and graphing. While solving, the regression models are automatically sorted according to how well they meet the specified criteria. Kahramanmaras has various water sources such as springs, rivers, dams, and lakes. In the area, the temperature is the dominant meteorological factor during the majority of the year and reaches its highest values during the summer. It has become a necessity to estimate the EP, which is an important component of the management of existing water resources, the amount of water loss, and the hydrological cycle. However, the Turkish State Meteorological Service measures the daily evaporation losses during only six months (April-October) of the year and solely over the Kahramanmaraş city center. To this end, the data gaps were first filled with Class A evaporation pan observations in the seasons that have no gauge measurements over approximately three years. Then, the prediction of evaporation in the province was investigated non-linearly with the different meteorological data sets. The meteorologically driven EP results were compared with the completed pan evaporation measurements. This article is organized as follows: Section 2 provides a description of the field of study, data sets, materials, and method; Section 3 describes the findings of the analysis and presents the results and discussion; and the conclusions and recommendations are presented in Section 4.

Materials and Methods
The simulated evaporation was attained using four different climatological phenomena via an artificial intelligence approach. Additionally, Class A evaporation pan measurements were obtained for the period April-October from the Turkish State Meteorological Service. Data gaps were filled with observations from another Class A evaporation pan at a newly built meteorological station. The generated evaporation values were compared with the completed Class A measurements. Then, the error propagation was explained with eight different statistics and graphs (Figure 1).

Model
Artificial intelligence-based models are increasingly being employed in almost every field with developing technology. For example, it has been demonstrated that artificial intelligence makes important contributions to modeling the seismic performance of steel connections used in civil engineering [29][30][31]. As another example, Guerra et al. [32] performed a comparative study of swarm intelligence metaheuristics in unscented Kalman filter-based neural training applied to the identification and control of robotic manipulators. Islam et al. [33] conducted a study on convolutional neural networks based on transfer learning models using data augmentation and transformation for the detection of concrete cracks. The findings show that transfer learning-based techniques are more useful for various detection tasks as well as for identifying fractures in concrete buildings. Ozbek et al. [34] predicted the uniaxial compressive strength of rocks with different characteristics by using genetic expression programming, and agreement was observed between the experimental and artificial-based estimated data sets. An artificial intelligence approach was applied using a qualitative data extraction framework for an agent-based model by Paudel and Ligmann-Zielinska [35]. They performed the alternative approach by developing a conceptual agent-based model of household food security in rural Mali. The analysis managed to reduce the subjectivity and biases by limiting the data extraction manipulation. In another study, precipitation-based runoff estimations were assessed via logical synthesis using character linear chromosomes composed of genes structurally organized in a head and a tail [36]. Their results illustrate that the artificial model has the ability to capture runoff using only precipitation, without topographic and hydrological data, for the rainfall events over a sub-tropical urbanized catchment. These types of simulations are typically based on regression analysis and on generating unknown parameters. Regression analysis helps to measure the relationship between multiple variables, fill in the gaps between values, make predictions in data-scarce situations, and generate extrapolated data sets for the future. The regression curve is fitted as closely as possible to the points it represents. Therefore, the distance of each point to the curve is calculated, and the regression line that makes the total distance the smallest is accepted [37,38]. When drawing the regression curve, the squared or absolute distance between the predicted points on the line and the actual measured data is calculated. The selected regression model is the one that minimizes this error.
Regression analysis is divided into two types, single and multiple regression, according to the number of variables. Univariate regression formulates the equation of a line representing the relationship between the dependent and independent variables. Multivariate regression analysis is defined as a regression model with one dependent variable and more than one independent variable. In the analysis model, the two main variables are named the independent variable (X) and the dependent variable (Y). The number of independent variables (X) can be increased according to the parameters used. The goodness of fit (R²) is a statistic that reflects the predictive power of the equation, in addition to measuring the success of the regression equation.
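The least-squares idea behind the univariate case can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `fit_line` and `sum_squared_residuals` are hypothetical, the data are synthetic, and the closed-form straight-line fit shown here is not the non-linear fitting performed by the regression software used in the study.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def sum_squared_residuals(x, y, slope, intercept):
    """Total squared vertical distance between the points and the line."""
    return sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))


# Synthetic data lying exactly on y = 2x + 1 (not from the study).
x_values = [0.0, 1.0, 2.0, 3.0]
y_values = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(x_values, y_values)
```

Any other line through these points gives a larger value of `sum_squared_residuals`, which is the sense in which the fitted line makes the total distance the smallest.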
In this study, DataFit, an easy-to-use program for regression analysis, was used. The basic method of regression analysis entails splitting the variability of the observed data sets into three components (Equation (1)): the Total Sum of Squares (TSS), the Residual Sum of Squares (RSS), and the Explained Sum of Squares (ESS) [39]. The term to the left of the equation represents the TSS, while those to the right symbolize the RSS and ESS, respectively. For a successful regression equation, the ESS should be large or the RSS should be small. Equation (2) is obtained by dividing both sides of Equation (1) by the TSS.

$$\sum_{i=1}^{n}\left(y_i-y_{ave}\right)^2=\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^2+\sum_{i=1}^{n}\left(\hat{y}_i-y_{ave}\right)^2 \quad (1)$$

$$1=\frac{RSS}{TSS}+\frac{ESS}{TSS} \quad (2)$$

where $y$ stands for a data set while $y_{ave}$ represents the average of the data series of $y$. $\hat{y}_i$ is its estimate based on the regression analysis, and $i$ symbolizes the index of the data point used; $i$ equals 1, 2, ..., $n$, and $n$ is the total number of data points. The ESS/TSS ratio is the share of the variability explained by the regression equation (R²). Accordingly, the value of the ratio can be used as a measure of the success of the regression equation (Equation (3)).

$$R^2=\frac{ESS}{TSS} \quad (3)$$

R² ranges from zero to one, and the model performance improves as the value approaches one, at which point perfect agreement is attained.
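The decomposition of the total variability into residual and explained parts can be checked numerically. The sketch below assumes an ordinary least-squares straight line with an intercept, for which the identity TSS = RSS + ESS holds; the function name `sum_of_squares_decomposition` and the data are hypothetical and only illustrative.

```python
def sum_of_squares_decomposition(x, y):
    """Return (TSS, RSS, ESS, R2) for a least-squares line through (x, y)."""
    n = len(y)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Least-squares slope and intercept.
    slope = (sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
             / sum((xi - x_mean) ** 2 for xi in x))
    intercept = y_mean - slope * x_mean
    y_hat = [slope * xi + intercept for xi in x]
    tss = sum((yi - y_mean) ** 2 for yi in y)                 # total
    rss = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))     # residual
    ess = sum((yh - y_mean) ** 2 for yh in y_hat)             # explained
    return tss, rss, ess, ess / tss


# Synthetic, nearly linear data (not from the study).
x_values = [1.0, 2.0, 3.0, 4.0, 5.0]
y_values = [2.1, 3.9, 6.2, 7.8, 10.1]
tss, rss, ess, r2 = sum_of_squares_decomposition(x_values, y_values)
```

For fits without an intercept, or for non-linear models, the identity does not hold in general, so R² computed as ESS/TSS and as 1 − RSS/TSS can differ.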
Hydrologists, hydrometeorologists, and hydrogeologists frequently employ statistical criteria to evaluate how closely simulated behaviors match measurements [40][41][42][43][44][45][46]. The following statistical indices were used in this study for the evaluation of the model performance. The Nash-Sutcliffe Coefficient of Efficiency (NSCE) gives an indication of the predictive skill of the model and can be calculated with Equation (4). In comparison to the NSCE alone, the Normalized Nash-Sutcliffe Coefficient of Efficiency (NNSE) gives a more accurate indicator of the model performance by accounting for the variability in the measurements (Equation (5)). Equation (6) gives the centered root mean square error (CRMSE), which quantifies the random error, while the root mean square deviation (RMSD) is the standard deviation of the estimation errors; in other words, it indicates how concentrated the data are around the best-fit line (Equation (7)). The Pearson correlation coefficient (PCC) indicates the relationship between the predicted EP values and the reference EP values and is calculated using Equation (8). The percent bias (PBias) indicates the overall percentage of the mean error magnitude and the error direction (under/overestimation) (Equation (9)). The mean absolute error (MAE) between the estimated and reference EP values indicates how close a regression curve is to a set of points in an absolute manner and can be calculated with Equation (10). In Equations (4)-(10), EP stands for the evaporation values [mm d⁻¹], while $\overline{EP}$ represents the mean of the EP values. The subscripts r and p indicate the reference and predicted evaporation values, respectively.
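As a rough illustration of these indices, the sketch below implements commonly used textbook definitions. These are assumptions rather than the study's exact formulas: in particular, the study reports CRMSE as a percentage, and the normalization behind that figure is not recoverable here, so the unnormalized value is returned; the PBias sign convention (positive for overestimation) is likewise assumed. The function name `evaluation_metrics` is hypothetical.

```python
import math


def evaluation_metrics(reference, predicted):
    """Common goodness-of-fit indices between reference and predicted EP."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("series must be non-empty and of equal length")
    n = len(reference)
    r_mean = sum(reference) / n
    p_mean = sum(predicted) / n
    errors = [p - r for r, p in zip(reference, predicted)]
    squared_error = sum(e * e for e in errors)
    ref_ss = sum((r - r_mean) ** 2 for r in reference)
    pred_ss = sum((p - p_mean) ** 2 for p in predicted)
    covariance = sum((r - r_mean) * (p - p_mean)
                     for r, p in zip(reference, predicted))
    nsce = 1 - squared_error / ref_ss
    return {
        "NSCE": nsce,
        "NNSE": 1 / (2 - nsce),  # maps (-inf, 1] onto (0, 1]
        "RMSD": math.sqrt(squared_error / n),
        # Unnormalized; the percentage form used in the study is not assumed.
        "CRMSE": math.sqrt(sum(((p - p_mean) - (r - r_mean)) ** 2
                               for r, p in zip(reference, predicted)) / n),
        "PCC": covariance / math.sqrt(ref_ss * pred_ss),
        "PBias": 100 * sum(errors) / sum(reference),  # assumed sign convention
        "MAE": sum(abs(e) for e in errors) / n,
    }


# Synthetic example: every prediction is 1 mm per day too high.
metrics = evaluation_metrics([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

In this synthetic case the constant offset shows up in RMSD, MAE, and PBias but not in CRMSE or PCC, which is why the centered and uncentered errors are reported side by side.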

Study Area and Data
The study area, Kahramanmaras, with a surface area of 14,502 km², is located in the Mediterranean region in southeast Turkey (Figure 2). The elevation varies from 130 to 3075 m above mean sea level, with a general elevation gradient dropping from north to southeast. The Ceyhan River, the province's main river, has various streams and dams, and the total length of the river branches is 4085 km. The annual average precipitation is 733.7 mm, and the surface and groundwater potentials are 4815 and 343 hm³, respectively [47,48]. The most significant losses from this huge water mass are undoubtedly due to EP, which is a complex phenomenon in the hydrological cycle.
Thus, it is critical to obtain accurate EP values in studies such as continuous hydrometeorological simulations, the precise determination of water volumes in dams, flood frequency, water budget, and other hydrological cycle analyses. For this aim, temperature (T), wind speed (W), relative humidity (M), precipitation (P), and evaporation (EP) measurements at the S1 Meteorology Airport Station were obtained from the Turkish State Meteorological Service (Figure 2). Unfortunately, the EP data sets are only available for the April-October period. Therefore, another meteorological station (S2) was built to complete the gaps in the data set and obtain a one-year continuous EP time series (Figure 2).
Figure 3 shows a detailed geological map of Kahramanmaras retrieved from Husing et al. [49]. Kahramanmaras has a diverse geological composition. The region is part of the Taurus Mountains and has a complex geology with various rock formations, including limestone, dolomite, and marble. This geological diversity can influence the region's hydrogeology. Oligocene bioclastic limestones are located at the margin of the area. According to the map, the base of the hill section consists of nummulitic limestones. These are followed by red conglomeratic sediments that include many basalt layers. The hydrogeological characteristics of Kahramanmaras are influenced by the geological formations in the area. Groundwater resources can be found in various aquifers, including fractured limestone and alluvial deposits. These aquifers play a crucial role in providing water for both agriculture and domestic use. The area's soil varies depending on the local topography and land use. The region has a range of soil types, including loamy soils, clayey soils, and stony soils.
The daily values of the Class A evaporation pan used in this study were measured in the Kahramanmaras Sutcu Imam University (KSU) Civil Engineering meteorology park. The steel pan here is circular, 122 cm in diameter and 25.4 cm in depth, identical to that at station S1. It was placed on an open pier platform, 15 cm above the soil level. The pan is leveled and placed flat on the wooden pallet (Figure 4). The S2 Station (KSU Station) is surrounded by a fence in order to prevent any animal interference, and the measurement readings were carried out at 9:00 am every day. Additionally, a pluviometer and a pluviograph were installed to measure the effect of precipitation in the evaporation pan accurately. These calibrated precipitation measurements were read simultaneously with the evaporation values. The pan was cleaned on days with frost or snow, but readings could not be taken due to the weather conditions, and these days are excluded from the analysis. The station was built on 1 September 2018, and measurement readings were carried out from 15 October 2018 to 15 March 2020. Unfortunately, readings were not possible after the COVID-19 pandemic and the devastating earthquake in Kahramanmaras due to their harmful consequences.
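The gap-filling step, in which S2 readings complete the dates without S1 measurements, can be sketched as a simple merge. This is a minimal illustration under stated assumptions: the function name `complete_series`, the dictionary layout, and the sample values are hypothetical, and the rule that S1 takes precedence wherever both stations have a reading is an assumption of the sketch rather than a documented choice of the study.

```python
from datetime import date


def complete_series(primary, secondary):
    """Merge two daily EP series, preferring the primary station.

    Both inputs map a date to a reading in mm per day; None marks a day
    without a usable reading (for example, frost or snow days).
    """
    merged = {day: value for day, value in secondary.items()
              if value is not None}
    # Primary readings overwrite secondary ones on shared dates.
    merged.update({day: value for day, value in primary.items()
                   if value is not None})
    return dict(sorted(merged.items()))


# Illustrative values only (not measurements from the study).
s1_airport = {date(2019, 4, 1): 3.2, date(2019, 4, 2): None}
s2_ksu = {date(2019, 3, 31): 2.1, date(2019, 4, 1): 3.0,
          date(2019, 4, 2): 3.4}
completed = complete_series(s1_airport, s2_ksu)
```

Days on which neither station has a reading simply remain absent from the merged series, mirroring the exclusion of unusable days from the analysis.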

Results
The EP values for 101 days were gathered at randomly chosen dates during the 2018 to 2020 data period to compare the reliability of the data from the KSU Station readings with the measurements at the Meteorology Airport Station (Figure 5). In Figure 5, the blue dots and the red line represent the evaporation measurement values for these 101 days and the fitted line, respectively. The 45-degree line is shown as a dashed black line. Scatter plots are used to show the analysis of the EP readings, and the axes represent the EP values in mm d⁻¹. While PCC values as high as 0.75 are obtained between the aforementioned data sets, it is observed that the KSU station tends to underestimate values greater than 6 mm d⁻¹. Overall, the concentration of the points around the 45-degree line demonstrates the agreement between the two data sets. Based on the high correlation and the agreement seen in Figure 5, the KSU Station was used and the EP time series was completed.
[...] the fitted line, respectively. The 45-degree slope is shown with a dashed black line. Scatter plots are used to show the analysis of the EP readings, and the axes represent the EP values in mm d−1. Although the PCC between the two data sets is as high as 0.75, the KSU station tends to underestimate values greater than 6 mm d−1. Overall, the concentration of points along the 45-degree line demonstrates the agreement between the two data sets. Based on this high correlation and the agreement shown in Figure 5, the KSU Station data were used to complete the EP time series.
Table 1 shows the equations for each of the 10 non-linear models. The artificial intelligence technique must be calibrated using a sample of the input and output values of the model. Therefore, the residuals are minimized and the goodness of fit is maximized by simulating the model over thousands of iterations. In Table 1, the first and second columns show the parameters used in each simulated model and their symbols, respectively, while the third column gives the meteorology-driven evaporation formulas obtained after calibration. The meteorological data sets were used in various combinations to predict the EP time series, as shown in the table. In the equations, P, T, M, and W represent daily precipitation (mm m−2), temperature (°C), relative humidity (%), and average wind speed (m s−1), respectively. EP denotes the evaporation estimated from the meteorological parameters, and its unit is mm m−2.
In the box plot in Figure 6, the alternative methods are shown on the horizontal axis, while the EP values in mm d−1 calculated with these approaches are shown on the vertical axis. The boxes and whiskers show the 5th, 25th, 50th, 75th, and 95th percentiles of each data set. The distribution of the EP values over the 3-year period, completed using the data sets from the two stations, is shown by the red box plot in the figure, and these data are used as the reference EP values. The reference EP time series varies from 0.01 to 18.2 mm d−1 and is concentrated in the range of 2.3 to 10.8 mm d−1, so at least half of all the EP values fall in this range. The best performance was obtained with the wind- and temperature-driven EP predictions, and the worst with the precipitation-only predictions. In all methods, and especially when temperature is used as an input parameter, the predictions overlap the EP measurements within the 25th to 75th percentile range. Figure 6 shows a tendency to overestimate the reference values below the 25th percentile in the EP methods where precipitation is taken into account. Additionally, underestimation is observed at the high values of the EP time series obtained with the F(P), F(P, W), and F(P, M, W) equations, especially at values greater than the 75th percentile. These formulae do not use temperature, which is assumed to cause these over- and underestimations.
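The percentile summary behind a box plot such as Figure 6 can be sketched in a few lines. This is only an illustration: the function name `boxplot_summary` and the sample values are hypothetical, and the inclusive interpolation rule is an assumption, since the plotting software used in the study is not stated.

```python
from statistics import quantiles


def boxplot_summary(values):
    """Return the 5th, 25th, 50th, 75th and 95th percentiles of a sample."""
    # quantiles(..., n=100) returns the 99 cut points between percentiles,
    # so index p - 1 holds the p-th percentile.
    cuts = quantiles(values, n=100, method="inclusive")
    return {p: cuts[p - 1] for p in (5, 25, 50, 75, 95)}


# Hypothetical daily EP readings in mm per day (illustrative only,
# not the station data used in the study).
ep_sample = [0.5, 1.8, 2.3, 4.1, 5.6, 6.9, 8.2, 10.8, 12.4, 18.2]
summary = boxplot_summary(ep_sample)

# Width of the box: the range that holds the middle half of the values.
interquartile_range = summary[75] - summary[25]
```

Comparing the 25th to 75th percentile range of each model against that of the reference series is one simple way to quantify the overlap described above.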

As can be clearly seen from Figure 7, the worst result in all the error metrics was obtained from the EP values produced with the F(P) function. Figure 7a shows the distribution of the NSCE, whose values range from 0.17 to 0.87. NSCE values greater than 0.7 were obtained without exception in all the artificial intelligence models that considered the effect of temperature, and the best score was obtained with the F(T, W) function. In the functions where precipitation is taken into account in addition to temperature, a slight negative effect on the NSCE was detected. The NNSE is the version of the NSCE normalized into the [0, +1] range; a value converging to one indicates that the predicted EP values are in perfect agreement with the reference, and Figure 7b shows these variations. As can be seen from the figure, except for the models in which the effect of temperature is ignored, the values cluster between 0.8 and 0.9, and the best value is obtained with the temperature- and wind speed-driven simulations, as with the NSCE results. Figure 7c quantifies the random error in terms of the CRMSE; the minimum and maximum values for the simulations were 24% (F(T, W)) and 62% (F(P)), respectively. In the other temperature-based methods, the CRMSE values are around 26%, and this error index reveals that modeling based only on precipitation, wind speed, and relative humidity does not provide sufficient performance. The RMSD is the square root of the second sample moment of the errors in the estimated EP values, and it is presented in mm d−1 in Figure 7d. These errors vary from 1.6 mm d−1 (F(T, W)) to 4.2 mm d−1 (F(P)). The larger errors are attributed to the precipitation-driven simulations not accounting for temperature.
The PCC and goodness of fit values are reported in Figure 7e,f, respectively. The lowest PCC and R2 values are found for the precipitation-only simulations, at 0.42 (PCC) and 0.17 (R2). In the other simulations, the PCC varies between 0.74 (F(P, W)) and 0.94 (F(T, W)), while the goodness of fit ranges from 0.55 (F(P, W)) to 0.87. The percent bias of each model relative to the in situ measurements is presented in Figure 7g. As can be seen from the figure, the PBias values vary within a narrow range, between −1% and 5%, in all functions when the rainfall-based model is excluded. In terms of the PBias error metric, the best performance is obtained with the F(P, T, W) simulation and the worst with the F(P) model. Finally, the MAE, the mean absolute error between the reference and predicted EP values, is illustrated in Figure 7h. While the predictions based on the P- and PW-driven functions performed worst in terms of MAE, the MAE in all the artificial intelligence models fell to approximately 1 mm d−1 once temperature was taken into account.
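For readers who want to reproduce this kind of evaluation, the sketch below computes several of the indices discussed for Figure 7 from paired observed and simulated series. It rests on assumptions that the text does not confirm: the NNSE is taken as 1/(2 − NSCE), R2 as the squared PCC, and PBias as the summed error relative to the summed observations, with negative values meaning underestimation. The CRMSE is omitted because its normalization to percent is not specified here. The name `error_metrics` is hypothetical.

```python
import math


def error_metrics(obs, sim):
    """Goodness-of-fit indices for paired observed and simulated values."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_s = sum(sim) / n
    sq_err = sum((s - o) ** 2 for o, s in zip(obs, sim))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_s = sum((s - mean_s) ** 2 for s in sim)
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim))

    nsce = 1.0 - sq_err / var_o          # Nash-Sutcliffe efficiency
    pcc = cov / math.sqrt(var_o * var_s)  # Pearson correlation
    return {
        "NSCE": nsce,
        "NNSE": 1.0 / (2.0 - nsce),       # assumed normalization into (0, 1]
        "RMSD": math.sqrt(sq_err / n),
        "MAE": sum(abs(s - o) for o, s in zip(obs, sim)) / n,
        "PCC": pcc,
        "R2": pcc ** 2,                   # assumed: squared correlation
        # Assumed sign convention: negative means underestimation.
        "PBias": 100.0 * sum(s - o for o, s in zip(obs, sim)) / sum(obs),
    }


# Hypothetical reference and predicted EP values in mm per day.
reference = [2.0, 4.0, 6.0, 8.0]
predicted = [1.0, 3.0, 5.0, 7.0]
scores = error_metrics(reference, predicted)
```

A constant offset, as in the hypothetical series above, leaves the PCC at one while lowering the NSCE and shifting the PBias, which is why the indices are best read together rather than in isolation.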

Discussion
The EP, one of the most vital and complex phenomena of the hydrological cycle, must be investigated under changing climate scenarios due to its critical importance in hydrometeorological simulations, flood analysis, water resource management, agricultural irrigation projects, and basin planning. Unfortunately, EP measurements are unavailable for about half of each year in Turkey, and these values vary spatiotemporally. Alternative approaches have been investigated because EP measurements are expensive, complex, and time-consuming [50–52]. Within the scope of this study, artificial intelligence techniques were used; they require less data, are relatively easy to compute, and have become more widely used as technology has developed.
A scatter plot was used to evaluate the agreement between the data from the Meteorology Airport Station and KSU Station. Taken as a whole, the concentration of points along the 45-degree line highlights the coherence between the two data sets, with a high correlation of 0.75. The KSU station was shown to consistently underestimate readings above 6 mm d−1. The disparities between the two data sets are assumed to be caused by the distance between the stations, their altitudes, and other meteorological factors such as temperature. Indeed, Li et al. [51] emphasized the importance of elevation and sunshine duration, in addition to temperature, for producing evaporation pan values. Other studies show that land use is another important factor affecting evaporation [19,52]. After coherent results were obtained from the scatter plot evaluation, EP simulations were carried out with an artificial intelligence model using the completed time series as the reference.
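The station comparison and the completion of the series can be sketched as follows. This is a minimal illustration, not the procedure used in the study: it assumes that missing days are marked with `None`, that the secondary station's reading is used as-is with no adjustment, and that the correlation is computed only on days when both stations report. The function names and the sample values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def overlap_pcc(primary, secondary):
    """Correlation over the days on which both stations have a reading."""
    pairs = [(p, s) for p, s in zip(primary, secondary)
             if p is not None and s is not None]
    xs, ys = zip(*pairs)
    return pearson(xs, ys)


def complete_series(primary, secondary):
    """Fill gaps (None) in the primary record with the secondary reading."""
    return [p if p is not None else s for p, s in zip(primary, secondary)]


# Hypothetical daily EP readings in mm per day; None marks a missing day.
airport = [3.1, 4.0, None, None, 6.2, 7.5]
ksu = [2.9, 4.3, 4.8, 5.1, 5.9, 6.8]

agreement = overlap_pcc(airport, ksu)
filled = complete_series(airport, ksu)
```

Checking the correlation on the overlapping days before merging mirrors the reasoning above: the second station is only used to complete the record once the two data sets have been shown to agree.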
Kumar et al. [50] claimed that, given data sets of sufficient duration, artificial intelligence-based pan evaporation forecasts can be obtained more easily than with traditional approaches. Therefore, EP values were simulated from various meteorological data sets using non-linear regression via the Datafit model. The effect of each meteorological parameter on EP was also examined one by one. The obtained models show that, in the functions where precipitation data are employed, the coefficients of the precipitation variable are less influential than those of the other variables in the formulae. Ablikim et al. [52] showed that, among natural factors, temperature and precipitation had the greatest and least influence on forecasting evaporation, respectively, which supports these results. The climatic parameter-driven simulations are represented with box and whisker plots. While the most accurate results were obtained with the wind- and temperature-driven EP predictions, the worst performance was obtained when only precipitation was used as an input parameter for forecasting the EP. Additionally, the analysis indicated that the variability in the EP magnitudes is narrower for the precipitation-driven methods (i.e., F(P), F(P, W), F(P, M), and F(P, M, W)) than for the temperature-driven estimations. The overestimation of the minimum extremes and the underestimation of the maxima are considered to be caused by the formulae that do not use temperature. These findings are supported by the results of Uzunlar et al. [19].
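To make the calibration step concrete, the sketch below fits a two-input evaporation function by least squares. It is a stand-in under stated assumptions and not the Datafit procedure: the power-law form EP = a · T^b · W^c is hypothetical (the calibrated formulas are those of Table 1), the fit is done on logarithms so it minimizes errors in log space, and it requires strictly positive T, W, and EP. All function names and sample values are illustrative.

```python
import math


def solve_linear(a, b):
    """Solve a small dense system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_power_law(temp, wind, ep):
    """Fit ln(EP) = ln(a) + b ln(T) + c ln(W); return (a, b, c)."""
    rows = [[1.0, math.log(t), math.log(w)] for t, w in zip(temp, wind)]
    y = [math.log(e) for e in ep]
    # Normal equations (X^T X) beta = X^T y for the log-linear model.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    ln_a, b, c = solve_linear(xtx, xty)
    return math.exp(ln_a), b, c


def predict(params, t, w):
    """Evaluate the hypothetical power-law model for one day."""
    a, b, c = params
    return a * t ** b * w ** c


# Hypothetical daily inputs: temperature (deg C) and wind speed (m/s).
temperature = [5.0, 10.0, 15.0, 20.0, 25.0, 30.0]
wind_speed = [1.0, 2.5, 1.5, 3.0, 2.0, 4.0]
# Synthetic EP generated from known coefficients, to check the fit.
ep_values = [0.5 * t ** 1.2 * w ** 0.3 for t, w in zip(temperature, wind_speed)]

params = fit_power_law(temperature, wind_speed, ep_values)
```

Repeating such a fit for each combination of P, T, M, and W and comparing the resulting predictions against the reference series is the general pattern behind a table of calibrated functions; a dedicated non-linear solver would be needed for forms that cannot be linearized this way.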
Additionally, to evaluate the methods in a more detailed and sensitive way, eight performance evaluation indices were calculated for the ten methods, each of which predicts the EP at a daily temporal resolution. Taking all the performance evaluation criteria into account, the best score was obtained with the F(T, W) approach. When precipitation was added to the equation, a slight negative effect on the NSCE performance was detected. The CRMSE, PCC, and other error indices demonstrated that modeling based only on precipitation, wind speed, and relative humidity does not estimate the EP time series with sufficient accuracy. Moreover, the F(T, M, W), F(P, T, W), F(P, M, T), and F(P, T, M, W) models tend to slightly overestimate the measurements, while underestimation is dominant in the other functions. The PCC results demonstrate the importance of wind speed, in addition to temperature, in estimating the EP via artificial intelligence techniques: the correlation increased from 0.42 to 0.74 when wind was added to precipitation. When the effect of temperature was also added to the analysis, the PCC value jumped to 0.92. According to this analysis, the overall model performance was higher than in the analysis of Kumar et al. [50] in India. They used three scenarios and four techniques and obtained PCC values ranging between 0.505 and 0.587, and their NSCE values varied between −0.227 and 0.188 for the multi-linear regression analysis.
Pan evaporation, estimated here using artificial intelligence techniques, is a nonlinear phenomenon affected by multiple climatological elements such as temperature, relative humidity, and wind speed, while including precipitation tends to reduce the prediction performance. However, none of the models developed in this study could capture the variability of the extreme EP values greater than the 95th percentile. The models' efficiency can be improved if altitude, land use, and land cover data sets are included in the simulations to forecast EP values. Another limitation of the study is that the performance was evaluated at a single spatiotemporal resolution. This issue can be addressed by applying the methodology in different study areas. This global phenomenon receives much attention from the scientific community and is sensitive to temporal resolution. Capturing EP with multiple parameters requires rigorous experimental studies [53,54].

Conclusions
In this study, the gaps in the EP measurements were first filled by building a new meteorological station in Kahramanmaras, where readings were taken over a three-year period. The relationship between the in situ measurements of the Meteorology and KSU stations is explained with the help of the PCC and scatter plots. These two data sets yield a PCC of 0.75. It has been shown that two meteorological stations close to each other can be used to measure EP values in order to complete a time series. However, the KSU station tends to underestimate extreme EP values of 6 mm d−1 and higher.
After the two data sets were combined, EP values were retrieved from the completed continuous time series through an artificial intelligence technique. For this purpose, ten EP estimations were obtained via a non-linear artificial intelligence model in the Datafit program for Kahramanmaras province, using daily temperature, precipitation, wind speed, relative humidity, and combinations of these meteorological parameters. The distributions of the EP values derived using the obtained functions are presented with box plots. While the completed EP measurements range from 0.01 to 18.2 mm d−1, the F(T, W) product ranges from 0.01 to 14.4 mm d−1, and the F(P)-based estimations vary between 2.9 and 7.1 mm d−1. The lower (25%) and upper (75%) quartile values of the measurements are 2 and 11 mm d−1, and the F(T, W)-based EP estimations capture this variation. The results indicate that the best performance was obtained with the wind- and temperature-based EP products, while the EP estimations driven by precipitation alone are insufficient for capturing the variability of the in situ measurements. It was concluded that temperature is an indispensable parameter for evaporation predictions in artificial intelligence modeling. In addition, while wind speed data improved the EP results in the nonlinear analysis, precipitation was not an effective parameter. The retrieved EP almost overlaps the readings between the 25th and 75th percentiles for all approaches, particularly when temperature is employed as an independent variable in the functions. The distribution of the EP values obtained was narrower for the precipitation-based methods. This reveals the importance of temperature and wind speed in obtaining EP values between the first and third quartiles.
In addition to box plots, scatter plots of eight statistical error metrics are presented in the study. For all indices, the EP values generated using only the functions derived from precipitation gave insufficient results, with NSCE = 0.17, NNSE = 0.54, CRMSE = 62%, RMSD = 4.2 mm d−1, PCC = 0.42, R2 = 0.17, PBias = −6.44%, and MAE = 3.58 mm d−1. Moreover, the results demonstrate that wind speed improves the performance of the EP values predicted by the artificial intelligence techniques, while temperature is revealed as the primary factor. For example, the NSCE is 0.85 for temperature-based estimations and increases to 0.87 when wind is also taken into account. A similar pattern is observed in terms of correlation: PCC = 0.92 and R2 = 0.85 become PCC = 0.94 and R2 = 0.87. These statistical metrics support the previous findings and reveal the importance of temperature and wind speed in EP predictions via the artificial intelligence approach.
This study demonstrated the potential of a nonlinear approach for obtaining evaporation estimates in sparsely gauged or ungauged catchments that lack sufficient in situ measurements. The Datafit model is able to derive EP data series. Additionally, the meteorological parameter with the greatest impact on these estimates was investigated with an artificial intelligence model. The evaluation based on the in situ measurements shows that the F(T, W) product has the highest correlation and lowest bias in terms of the derived EP compared to the other functions. It can be concluded that wind speed is important and temperature is an indispensable parameter in this approach. The advantage of the proposed methodology is that it estimates the EP quickly and with sufficient accuracy from a small amount of data (i.e., temperature and wind speed).
Testing the applicability of the obtained results in different study areas can motivate future studies of basins lacking adequate ground-based records. Another subject for future research is error propagation, that is, evaluating these results against the traditional empirical EP approaches to support hydrological analyses. Additionally, the results of this paper can motivate a comparison with the performance of global evaporation products.

Figure 1. Flow diagram used in the methodology.

Sustainability 2023 , 16 Figure 2 .
Figure 2. Map of Kahramanmaras city and the location of meteorological stations.


Figure 4. Class A evaporation pan at KSU Civil Engineering Meteorology Station (photograph was taken by M.O. Dis on 13 September 2023).

Figure 5. Scatter plot of daily ET measurements in Kahramanmaras.

Figure 7. Scatter plots of error metrics at daily temporal resolutions: (a) Nash-Sutcliffe coefficient of efficiency, (b) normalized Nash-Sutcliffe coefficient of efficiency, (c) centered root mean square