A Multimethod Analysis for Average Annual Precipitation Mapping in the Khorasan Razavi Province (Northeastern Iran)

: The spatial distribution of precipitation is one of the most important climatic variables used in geographic and environmental studies. However, when there is a lack of full coverage of meteorological stations, precipitation estimations are necessary to interpolate precipitation for larger areas. The purpose of this research was to ﬁnd the best interpolation method for precipitation mapping in the partly densely populated Khorasan Razavi province of northeastern Iran. To achieve this, we compared ﬁve methods by applying average precipitation data from 97 rain gauge stations in that province for a period of 20 years (1994–2014): Inverse Distance Weighting, Radial Basis Functions (Completely Regularized Spline, Spline with Tension, Multiquadric, Inverse Multiquadric, Thin Plate Spline), Kriging (Simple, Ordinary, Universal), Co-Kriging (Simple, Ordinary, Universal) with an auxiliary elevation parameter, and non-linear Regression. Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and the Coefﬁcient of Determination ( R 2 ) were used to determine the best-performing method of precipitation interpolation. Our study shows that Ordinary Co-Kriging with an auxiliary elevation parameter was the best method for determining the distribution of annual precipitation for this region, showing the highest coefﬁcient of determination of 0.46% between estimated and observed values. Therefore, the application of this method of precipitation mapping would form a mandatory base for regional planning and policy making in the arid to semi-arid Khorasan Razavi province during the future.


Introduction
Precipitation systems can be mainly regrouped in convective and stratiform events, and the main worldwide observed rainfall patterns can be considered as a combination of these two components [1]. Considering the tempo-spatial changes of precipitation on the one hand, and an often relatively low density of rain gauge stations on the other hand, it is common to estimate, rather than measure, the precipitation distribution over a larger area. The obtained isohyetal map describing this distribution is the basis of land-use planning, environmental studies, disaster management, hydrological analysis, and water resources management [2][3][4]. However, the accuracy of the production of these maps depends on the methods used for the interpolation of the observed precipitation for the total considered area. Many different interpolation methods have been used in climate analysis so far [5][6][7][8][9][10][11][12], and finding the most suitable precipitation interpolation method is still a research desideratum for many regions [4,[13][14][15][16][17].

Methods
During this study, we tested five different interpolation methods to map annual precipitation in the Khorasan Razavi province that are described below. The models used include deterministic models (Inverse distance weighting and Radial basis function), geostatistical models (Kriging and Co-Kriging), and non-linear regression. The data were processed using GIS 10.5 and Curve Expert Pro Ver 2.6.5 software (Hyams Development, Alabama, United States).

Inverse Distance Weighting (IDW)
In the IDW method, the value of a single point is more strongly related to the nearby points around instead of points farther away. The equation is given by Equation (1): (1) Z o is the estimated point o, Z i is the value of Z at point i, d i is the distance between points i and o, s is the number of points used in the estimate, and k is specified power. The power k controls the degree of local impact. A power of 1 means a constant amount of change in weighting of the known points with distance (linear interpolation), and a power of 2 or >2 indicates that the weighting of known points close to the point of interest is much greater than the distance would suggest. The degree of local impact also depends on the number of known points that are used for the estimate. Some studies showed that a lower number of points (6 points) provides more favorable estimates than more points (12 points). The most important feature of the IDW method is that all predicted values range between the maximum and minimum known points [28].

Radial Basis Function (RBF)
RBF uses a generic function that depends on the distance between the points of interpolation and sampling [29]. The mathematical equation is given by Equation (2): In this equation, ψ(d) is the radial base function, and (d j ) shows the distance between the points of sampling and the predicted point x. F(x) represents the process of the function and the fundamental member for polynomials with degrees less than m. The RBF calculations were based on the functions of Completely Regularized Spline (CRS), Spline with Tension (ST) and Multiquadric (MQ), Inverse Multiquadric (IMQ), and Thin Plate Spline (TPS). All equations are given below in Equations (3)- (7).
TPS : ψ(d) = c 2 d 2 ln(cd) (7) d: Is the distance from sample to prediction location c: Is a smoothing factor I 0 (): Is the modified Bessel function E: Euler's constant [30].

Kriging Method
Kriging differs from the other interpolation methods, because it can determine the quality of interpolation with the magnitude of error that occurred in predicting values. The Kriging method uses a semi-variogram to measure the spatially related component or spatial self-correlation (Please see the explanation of the semi-variogram in Appendix A). Kriging is calculated as given below in Equation (8): Here, ∧ Z(x 0 ) is the estimated random field (prediction attribute) value at point x 0 , and µ is the stationary mean treated as constant over the whole region of interest (RoI). The parameter ω i is the weight assigned to the ith interpolating point calculated from the semi variogram, and Z(x i ) is the measured attribute value at a point x i [31].
If there is a spatial dependence in the dataset, the points of contact that are close to each other must have a lower semi-variance compared with more distant points. Ordinary Kriging (OK), Universal Kriging (UK), and Simple Kriging (SK) are the Kriging methods that are used in this research. The difference between OK and SK is the assumption of stationarity, which expects the mean and distribution to remain constant throughout the region. SK utilizes this assumption while OK does not, and instead recalculates the mean across the modeled area by a shifting search radius. But in reality, the mean value of some spatial data cannot be assumed to be constant in general but varies, since it also depends on the absolute location of the sample. For this sake, we use the UK method aiming to predict Z(x) at unsampled places as well [32,33]. One precondition for using Kriging methods is the normalization of the data, or at least a near normally distributed dataset so that this method can give the best estimate with the lowest error coefficient. Here we used the log-normal method for normalization.

Co-Kriging Method
Co-Kriging is a multivariate version of Kriging that uses the spatial correlation between the primary variable (annual precipitation) and an auxiliary variable (here the elevations of the gauging stations) to estimate the main variable. Similar to the Kriging method, Ordinary Co-Kriging (OCo-K), Universal Co-Kriging (UCo-K), and Simple Co-Kriging (SCo-K) are used here. In Co-Kriging, along with calculating the semi variogram of the primary and secondary variable, it is necessary to calculate the cross semi-variogram that expresses the spatial correlation between those two variables (Please see the explanation of the semi-variogram in the Appendix B). The Co-Kriging equation (Equation (9)) is as follows: In this equation, λ i is the weight of the main Z variable in x i position, λ k is the weight of the U auxiliary variable in x k position, and U(x k ) is the observed value of the x k auxiliary variable in the position [34].

Non-Linear Regression
Non-linear regression analysis enables us to predict the changes of dependent variables through independent variables, and the contribution of every independent variable is determined by the presentation of a dependent variable [35]. Similar to Co-Kriging, we selected elevation as an independent variable, since this factor generally has a significant effect on the amount of precipitation. For interpolation using non-linear regression methods, first appropriate regression models have to be determined. This is achieved by calculating the degrees of correlation between the selected dependent and independent variables for several potential models. To find the best model for precipitation zoning in the study area, 69 non-linear regression models were evaluated (please see a list of all models in Table S1 of Supplementary Material). From these we selected the Sinusoidal, Hoerl, and Steinhart-Hart equations, since these gave the highest correlations between annual precipitation and elevation. The equations and coefficients of these selected models are as follows: Equation (10) and Coefficients of Sinusoidal regression: Equation (11) and Coefficients of Hoerl regression: Atmosphere 2021, 12, 592 7 of 19 Equation (12) and coefficients of Steinhart-Hart equation regression: The values of x in Equations (10)- (12) are the station elevation data taken from the Khorasan Razavi province Meteorology Organization and the digital elevation model (DEM) of the studied region (GTOPO30: https://earthexplorer.usgs.gov (accessed on 28 April 2021) and ALOS PALSAR: https://vertex.daac.asf.alaska.edu (accessed on 28 April 2021)).

Validation Criteria for the Applied Methods
There are various criteria for validating interpolation methods, with one of the most important being k-fold cross-validation [36]. This method estimates a value for every observation point by means of interpolation. Subsequently, the estimated value is compared with the observed value, and the model with the least error is regarded as the superior one. There are various ways to compare estimated and observed values, and the most important ones include Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and the Coefficient of Determination (R 2 ) [37][38][39][40]. The equations to calculate RMSE (Equation (13)), (MAE) (Equation (14)), and R 2 (Equation (15)) are given below: In Equations (13) and (14), z(x i ) is the observed value,ẑ(x i ) is the estimated value and n is the number of sites. In Equation (15), X n is the observed amount, Y n is the estimated amount and n is the number of data. Lower values of MAE and RMSE and higher values of R 2 indicate a better performance of the interpolation method. In case of a very precise estimator, MAE and RMSE would be zero, and R 2 would be 1. Table 2 shows the descriptive statistics of the annual precipitation data of the studied gauging stations. According to that table, average annual precipitation in the study area is 229.77 mm. The coefficient of variation (CV) is 25.67 mm, indicating a relatively low spatial variation of regional precipitation. Furthermore, that low CV value also indicates the absence of large outliers in the regional dataset. Based on the results of the cross-validation, the parameters (k) and maximum and minimum number of neighborhood points (s) were optimized to minimize RMSE and MAE. Doing so, a power (k) of 1, a minimum number of neighborhood points of 2 and a maximum number of 6 neighborhood points gave the lowest possible error values of 53.07 for RMSE and of 1.108 for MAE, and the highest value for of 0.166, respectively. The resulting interpolated annual precipitation map of the Khorasan Razavi province is shown in Figure 2.

Radial Basis Function (RBF)
Of the five Radial Basis Functions that were evaluated, Spline with Tension showed the lowest RMSE error, an intermediate MAE error, and the highest 2 R value, and therefore turned out to be the best method (Table 3). It should be noted that the number of neighboring points was at least 2 and at most 8, and for optimizing and minimizing the estimation errors [41], the Kernel function 5/195387 was used. Interpolated annual precipitation maps of the Khorasan Razavi province derived from the five RBF functions are shown in Figure 3.

Radial Basis Function (RBF)
Of the five Radial Basis Functions that were evaluated, Spline with Tension showed the lowest RMSE error, an intermediate MAE error, and the highest R 2 value, and therefore turned out to be the best method (Table 3). It should be noted that the number of neighboring points was at least 2 and at most 8, and for optimizing and minimizing the estimation errors [41], the Kernel function 5/195387 was used. Interpolated annual precipitation maps of the Khorasan Razavi province derived from the five RBF functions are shown in Figure 3.

Kriging
Ordinary, Universal, and Simple Kriging methods were used in this study. The results of the cross-validation show that Simple Kriging (SK) had the largest MAE error but the least RMSE error and the highest coefficient of determination, and therefore turned out to be the best method (Table 4). In order to achieve the best estimate with the SK method, we used the Hole Effect's theoretical model to fit the semi-variogram with a maximum of 6 neighborhoods. The interpolated annual precipitation maps of the Khorasan Razavi province derived from the three applied Kriging methods are shown in Figure 4.

Kriging
Ordinary, Universal, and Simple Kriging methods were used in this study. The results of the cross-validation show that Simple Kriging (SK) had the largest MAE error but the least RMSE error and the highest coefficient of determination, and therefore turned out to be the best method (Table 4). In order to achieve the best estimate with the SK method, we used the Hole Effect's theoretical model to fit the semi-variogram with a maximum of 6 neighborhoods. The interpolated annual precipitation maps of the Khorasan Razavi province derived from the three applied Kriging methods are shown in Figure 4.

Co-Kriging
Given that Co-Kriging generally uses an auxiliary parameter, we used the elevation as such an effective auxiliary parameter for precipitation interpolation in the Khorasan Razavi province. Simple (SCoK), Ordinary (OCoK), and Universal (UCoK) Co-Kriging were compared with each other. Despite showing a higher MAE error compared with SCoK, OCoK and UCoK showed the lowest and identical RMSE errors and the highest and identical values (Table 5). Therefore, OCoK and UCoK turned out to be the best Co-Kriging methods. The interpolated annual precipitation maps of the Khorasan Razavi province derived from the three Co-Kriging methods are shown in Figure 5.

Co-Kriging
Given that Co-Kriging generally uses an auxiliary parameter, we used the elevation as such an effective auxiliary parameter for precipitation interpolation in the Khorasan Razavi province. Simple (SCoK), Ordinary (OCoK), and Universal (UCoK) Co-Kriging were compared with each other. Despite showing a higher MAE error compared with SCoK, OCoK and UCoK showed the lowest and identical RMSE errors and the highest and identical R 2 values (Table 5). Therefore, OCoK and UCoK turned out to be the best Co-Kriging methods. The interpolated annual precipitation maps of the Khorasan Razavi province derived from the three Co-Kriging methods are shown in Figure 5.

Non-Linear Regression Method
According to our preceding analysis, the best non-linear regression models with respect to annual precipitation and elevation data in the study area are the Sinusoidal, Hoerl, and Steinhart-Hart equations, since these show the highest correlations between annual precipitation as the dependent and elevation as the independent variable: The correlations were 0.607, 0.576, and 0.566, and the values were 0.364, 0.332, and 0.324, respectively. In Figure 6, the regression lines of the three different non-linear regression methods are shown in a scatter plot that displays altitude and annual precipitation of the gauging stations in the study area.

Non-Linear Regression Method
According to our preceding analysis, the best non-linear regression models with respect to annual precipitation and elevation data in the study area are the Sinusoidal, Hoerl, and Steinhart-Hart equations, since these show the highest correlations between annual precipitation as the dependent and elevation as the independent variable: The correlations were 0.607, 0.576, and 0.566, and the R 2 values were 0.364, 0.332, and 0.324, respectively. In Figure 6, the regression lines of the three different non-linear regression methods are shown in a scatter plot that displays altitude and annual precipitation of the gauging stations in the study area. Atmosphere 2021, 12, x FOR PEER REVIEW 13 of 20 Figure 6. Scatter plot displaying altitude and annual precipitation data of the gauging stations in the study area, and the regression lines of the three applied non-linear regression methods.
In order to integrate altitudinal zoning, equations and correlation coefficients of the three models were combined with the digital elevation model (DEM) of the study area, resulting in interpolated annual precipitation maps of the Khorasan Razavi province (Figure 7). According to the cross-validation, on the one hand the Sinusoidal model shows the highest RMSE and MAE errors, but on the other hand it shows the highest value of all non-linear regression models ( Table 6). As can be seen by the regression lines in Figure  6, maximal annual precipitation calculated with the Hoerl and Steinhart-Hart equations is much higher than that calculated with the Sinusoidal model, resulting in values for maximal annual precipitation of 757 mm (Hoerl) and 1076 mm (Steinhart-Hart) compared with 329 mm (Sinusoidal). By comparing the calculated values of these three models with the observed data ( Figure 6), it can be seen that the Hoerl and Steinhart-Hart models have estimated 357 and 676 mm more than the highest measured amount of precipitation in the In order to integrate altitudinal zoning, equations and correlation coefficients of the three models were combined with the digital elevation model (DEM) of the study area, resulting in interpolated annual precipitation maps of the Khorasan Razavi province (Figure 7). According to the cross-validation, on the one hand the Sinusoidal model shows the highest RMSE and MAE errors, but on the other hand it shows the highest value of all non-linear regression models ( Table 6). As can be seen by the regression lines in Figure  6, maximal annual precipitation calculated with the Hoerl and Steinhart-Hart equations is much higher than that calculated with the Sinusoidal model, resulting in values for maximal annual precipitation of 757 mm (Hoerl) and 1076 mm (Steinhart-Hart) compared with 329 mm (Sinusoidal). By comparing the calculated values of these three models with the observed data ( Figure 6), it can be seen that the Hoerl and Steinhart-Hart models have estimated 357 and 676 mm more than the highest measured amount of precipitation in the According to the cross-validation, on the one hand the Sinusoidal model shows the highest RMSE and MAE errors, but on the other hand it shows the highest R 2 value of all non-linear regression models ( Table 6). As can be seen by the regression lines in Figure 6, maximal annual precipitation calculated with the Hoerl and Steinhart-Hart equations is much higher than that calculated with the Sinusoidal model, resulting in values for maximal annual precipitation of 757 mm (Hoerl) and 1076 mm (Steinhart-Hart) compared with 329 mm (Sinusoidal). By comparing the calculated values of these three models with the observed data ( Figure 6), it can be seen that the Hoerl and Steinhart-Hart models have estimated 357 and 676 mm more than the highest measured amount of precipitation in the region, respectively, whereas the Sinusoidal model has estimated 71 mm less than the actual amount of observed precipitation in the region. Therefore, it can be concluded that the amount of precipitation estimated by the Sinusoidal model is closer to reality, so that this model was selected for further evaluation. Table 6. Errors and correlations between observed and estimated annual precipitation amounts of the three selected non-linear regression methods.

Discussion
The results of the analysis for the five compared interpolation methods are comparatively shown in Table 7 and Figure 8 (in case of several sub-methods the result(s) of the best-performing method(s) was/were selected). Ordinary and Universal Co-Kriging showed the lowest errors with an R 2 value of 0.469, as well as showing the highest R 2 values of all methods, and therefore gave the most correct results of interpolation for the Khorasan Razavi province. Among the deterministic and geostatistical methods, with an R 2 value of just 0.166, IDW seems to be the worst method for annual precipitation interpolation in the Khorasan Razavi province. However, with an R 2 value of 0.119, the (best-performing) regression method with Sinusoidal function gave the worst annual precipitation estimation in the study area of all methods. Generally, due to the regular application of altitude as an auxiliary elevation variable when estimating precipitation for an area, in the case of a missing high correlation between elevation and precipitation regression methods are not capable of interpolating precipitation with high accuracy. In our present study, the average correlation between annual precipitation and elevation of the best-performing Sinusoidal function was only 60% (Figure 7). Therefore, our precipitation estimation using regression models was very weak ( Table 7). The observed low correlation between annual precipitation and elevation could be due to: (i) The lack of rain gauge and synoptic stations at altitudes above 2313 m a.s.l. led to a lack of observational data for these higher regions. Therefore, the model has to determine precipitation for these altitudes using recorded data from lower altitudes, what decreases the general accuracy of this model; (ii) An unequal distribution of observation stations in the area ( Figure 1); (iii) Luff-lee effects, leading to different precipitation amounts upwind and downwind of moisture-laden airflow at similar altitudes. Similarly, also during former studies, it was found that in case of a low correlation between annual precipitation and elevation regression models show rather bad results compared with the Kriging and Co-Kriging methods [42][43][44]. Generally, due to the regular application of altitude as an auxiliary elevation variable when estimating precipitation for an area, in the case of a missing high correlation between elevation and precipitation regression methods are not capable of interpolating precipitation with high accuracy. In our present study, the average correlation between annual precipitation and elevation of the best-performing Sinusoidal function was only 60% (Figure 7). Therefore, our precipitation estimation using regression models was very weak ( Table 7). The observed low correlation between annual precipitation and elevation could be due to: (i) The lack of rain gauge and synoptic stations at altitudes above 2313 m a.s.l. led to a lack of observational data for these higher regions. Therefore, the model has to determine precipitation for these altitudes using recorded data from lower altitudes, what decreases the general accuracy of this model; (ii) An unequal distribution of observation stations in the area ( Figure 1); (iii) Luff-lee effects, leading to different precipitation amounts upwind and downwind of moisture-laden airflow at similar altitudes. Similarly, also during former studies, it was found that in case of a low correlation between annual precipitation and elevation regression models show rather bad results compared with the Kriging and Co-Kriging methods [42][43][44].
Generally, there is no model that can decisively be selected as the best one for all regions. With respect to Iran, for some regions with a high correlation between precipitation and elevation, regression models turned out to be the best [4,14-17], whereas in most Generally, there is no model that can decisively be selected as the best one for all regions. With respect to Iran, for some regions with a high correlation between precipitation and elevation, regression models turned out to be the best [4,[14][15][16][17], whereas in most regions, Kriging and Co-Kriging gave more accurate results compared with the other methods [15,16,20]. Similarly, despite some studies describing IDW as the best method [45][46][47], many studies emphasize the high accuracy of geostatistical methods (Kriging and Co-Kriging) for precipitation estimation for other regions as well, with Co-Kriging using the auxiliary elevation variable often showing the highest accuracy [9,12,[48][49][50][51][52][53][54][55]. Similar to in the above-mentioned studies, also during this study an acceptable accuracy of (Ordinary and Universal) Co-Kriging, using the auxiliary elevation variable, was found. Unlike regression models that only use elevation as an auxiliary parameter, Co-Kriging takes into account the autocorrelation factor and the statistical relationship between data in the region, leading to more reliable estimates compared with regression methods.
The annual precipitation map obtained by ordinary Co-Kriging shows highest annual precipitation concentrated in the mountainous northern part of the province. In contrast, in the southern and western part, showing lower altitudes, annual precipitation decreases. Corresponding with this natural water supply, the population density of the province decreases from North to South (Figure 9). Generally, finding a suitable method for interpolating and estimating precipitation in an area can have many applications based on different aspects. On the one hand, the accurate interpolation of precipitation can be of great help for land-use planners to allocate suitable agricultural, industrial, tourist, residential and other uses. On the other hand, for controlling and managing natural and environmental disasters such as floods, dust phenomena, landslides, desertification, deforestation, etc., which are directly and indirectly related to the amount of precipitation in the region, accurate estimations of precipitation can be very useful and effective. Consequently, an accurate interpolation of precipitation in a region provides decision-makers with a comprehensive view of precipitation conditions, which can be used to plan well and control water resources. Therefore, finding a suitable model for accurate spatial estimation of precipitation in the arid to semi-arid study area, which is always in crisis in terms of precipitation, is a major challenge.
like regression models that only use elevation as an auxiliary parameter, Co-Kriging takes into account the autocorrelation factor and the statistical relationship between data in the region, leading to more reliable estimates compared with regression methods.
The annual precipitation map obtained by ordinary Co-Kriging shows highest annual precipitation concentrated in the mountainous northern part of the province. In contrast, in the southern and western part, showing lower altitudes, annual precipitation decreases. Corresponding with this natural water supply, the population density of the province decreases from North to South (Figure 9). Generally, finding a suitable method for interpolating and estimating precipitation in an area can have many applications based on different aspects. On the one hand, the accurate interpolation of precipitation can be of great help for land-use planners to allocate suitable agricultural, industrial, tourist, residential and other uses. On the other hand, for controlling and managing natural and environmental disasters such as floods, dust phenomena, landslides, desertification, deforestation, etc., which are directly and indirectly related to the amount of precipitation in the region, accurate estimations of precipitation can be very useful and effective. Consequently, an accurate interpolation of precipitation in a region provides decision-makers with a comprehensive view of precipitation conditions, which can be used to plan well and control water resources. Therefore, finding a suitable model for accurate spatial estimation of precipitation in the arid to semi-arid study area, which is always in crisis in terms of precipitation, is a major challenge.

Conclusions
The precipitation map is a main database for environmental issues and studies such as disaster management, hydrological analysis, agricultural management, etc. Therefore, figuring out an accurate interpolation method to estimate the distribution of precipitation in regions that largely lack synoptic and rain gauge stations is an urgent task. During this study, for the first time we systematically compared five different methods to interpolate annual precipitation for the Khorasan Razavi province in northeastern Iran. This should allow extensive simulations of regional precipitation with high confidence. Similar to former studies, our results showed that with a coefficient of determination of 46.9% Ordinary Co-Kriging, also taking into account elevation, gave the most reliable results. In contrast, regression methods showed the largest errors and lowest coefficient of determination of only 12%. Therefore, our study suggests that the application of precipitation mapping using Ordinary Co-Kriging with an auxiliary elevation variable would form a mandatory base for future water supply-related regional planning and policy making in the Khorasan Razavi province. This holds especially true against the background of the current global climate change leading to a regional aridification trend in this province, and given its