Cross Assessment of Twenty-One Different Methods for Missing Precipitation Data Estimation

The results of meteorological, hydrological, and environmental data analyses depend heavily on the reliable estimation of missing data. In this study, 21 classical methods were evaluated to determine the best method for infilling missing precipitation data in Ethiopia. Monthly data collected from 15 stations over 34 years, from 1980 to 2013, were considered. Homogeneity and trend tests were performed to check the data. The results of the different methods were compared using the mean absolute error (MAE), root-mean-square error (RMSE), coefficient of efficiency (CE), similarity index (S-index), skill score (SS), and Pearson correlation coefficient (rPearson). The results confirmed that the normal ratio (NR), multiple linear regression (MLR), inverse distance weighting (IDW), correlation coefficient weighting (CCW), and arithmetic average (AA) methods are the most reliable of those studied. The NR method provides the most accurate estimations, with an rPearson of 0.945, MAE of 22.90 mm, RMSE of 33.695 mm, S-index of 0.999, CE of 0.998, and SS of 0.998. When the observed values were compared with the estimates from the NR, MLR, IDW, CCW, and AA methods, the MAE and RMSE were low, and the values of CE, S-index, SS, and rPearson were high. On the other hand, the closest station (CS), UK traditional, linear regression (LR), expectation maximization (EM), and multiple imputation (MI) methods gave the lowest accuracy, with MAE and RMSE values varying from 30.424 to 47.641 mm and from 49.564 to 58.765 mm, respectively. The results of this study suggest that the recommended methods are applicable to different types of climatic data in Ethiopia and in arid regions of other countries.


Introduction
Hydrological, climatological, and meteorological analyses are mainly based on the availability of rainfall data [1], although problems associated with missing data are common and exist for various reasons. Gaps may result from stations being relocated because of urbanization, errors in the methods implemented for measuring the rainfall amount, or the breakdown of instruments for a specific period, particularly in areas of flooding [2]. The results of meteorological and hydrological models can be affected when the rainfall data series contain missing values [3]. As a result, filling the gaps left by the missing data and estimating the missing values has become very important in recent hydrological studies [2]. Data infilling approaches utilize different methods for estimating missing climatological data [4,5]. The estimation methods for missing data can be categorized into three groups: statistical, empirical, and function fitting methods [6]. However, several studies classified the approaches for infilling missing data into four groups: deterministic methods, stochastic methods, artificial intelligence methods, and geostatistical methods [7][8][9][10]. Deterministic and geostatistical approaches are the most commonly implemented and include the arithmetic average (AA), normal ratio (NR), single best estimator (SBE), inverse distance weighting (IDW), coefficient of correlation weighting (CCW), and multiple linear regression (MLR) within a 50 to 250 km radius from the target station [11][12][13][14]. Nevertheless, the challenge is choosing the most suitable method for estimating the missing climate data [7,15]. The efficiency of these approaches varies from area to area depending on the variances in climate and the meteorological elements to be estimated [13]. The slope, topography, surface, and meteorological conditions are the main local factors affecting the climate elements [7].
The simple arithmetic mean method (AA), inverse distance weighting method (IDW), and correlation coefficient weighting method (CCW) are considered to be empirical methods [16]; the linear regression (LR), MLR, and weighted linear regression (WLR) methods are considered to be statistical methods. The use of empirical and statistical methods mainly depends on the characteristics of the missing data [17]. The application of these approaches is mainly dependent on the period of the data gap, the season, the climate in the study region, the density and distribution of stations, and the characteristics of the archived data [18,19]. Xia et al. [6] estimated the missing data based on the closest neighboring station considering a geometric weighting. Willmott et al. [20] used the arithmetic average (AA) of the data from the neighboring stations to estimate the missing values. Teegavarapu and Chandramouli [21] estimated missing rainfall values based on inverse distance weighting (IDW), neural networks, and the kriging method using the data from neighboring stations. Their results confirmed that the accuracy of the IDW method can be improved through a better definition of weighting parameters and a surrogate measure for distances [21]. De Silva et al. [13] and Suhaila et al. [2] used different methods, such as AA, NR, aerial rainfall ratio, IDW, CCW, and a combination of the IDW and CCW methods. The results confirmed that the NR method is the best method for estimating the missing data compared with the other methods. The most suitable method to estimate missing rainfall data can change for various regions based on the rainfall patterns and the spatial distribution [13]. Pizarro et al. and Xia et al. used simple LR and MLR for predicting missing precipitation and temperature data, respectively [6,22]. Alfaro and Pacheco applied different estimation methods to estimate missing rainfall data, including the LR model and the NR method [23].
The results of Xia et al., Alfaro and Pacheco, and Pizarro et al. confirmed that the most suitable approach is the multiple linear regression method [6,22,23]. Dastorani et al. [24] used four approaches for estimating the missing data, i.e., the NR method, CCW, an artificial neural network (ANN), and an adaptive neuro-fuzzy inference system (ANFIS) method. The ANFIS approach was found to be the most suitable method for missing flow data, whereas the performance of the ANN approach was found to be more reliable than traditional methods. The literature review confirms that there are no substantial investigations that assess the different methods for estimating missing precipitation data in arid regions such as Ethiopia, and most studies have been implemented in countries with wet climates. This study aims to investigate the application of 21 different methods for predicting the missing rainfall data in arid areas of Ethiopia and to determine the most suitable method. The 21 investigated methods are AA, NR, the geographical coordinates (GC) method, the normal ratio with geographical coordinates (NRGC), IDW, modified inverse distance weighting (MIDW), CCW, LR, MLR, MI, the Nonlinear Iterative Partial Least Squares NIPALS algorithm for missing data, UK traditional (UK), expectation maximization (EM), CS, modified coefficient correlation weighting (MCCW), modified correlation coefficient with inverse distance weighting (MCCWID), modified normal ratio with inverse distance (NRID), modified old normal ratio with inverse distance (ONRID), normal ratio inverse distance weighting with correlation (NRIDWCC), modified normal ratio based on correlation (MNR), and modified normal ratio based on square root distance (MNR-T).

Study Area and Data Analysis
The Blue Nile River is considered the main tributary of the River Nile, with a total drainage area of approximately 176,000 km², around 17% of Ethiopia's total area. The study area is located in the upper Blue Nile Basin. It contains dry and arid areas and is indexed as a dry/arid climate [25]. The average values of the annual precipitation and temperature in the study area are 94.25 mm and 17 °C, respectively. The monthly precipitation data between 1980 and 2013 from the fifteen rain-gauge stations located in Ethiopia, namely, Motta, Adet, Amba Marim, Ancharo, Bahir Dar, Combolcha, Degelo, Dejein, Gondar, Haik, Korem, Mekane Selem, Nefas Mewcha, Yejuibe, and Yetemen, were used in this study. The monthly precipitation data used here were collected from the National Water Research Center (NWRC) of Egypt and the National Meteorological Agency (NMA) of Ethiopia. The type of climate at all stations was calculated using the De Martonne [26] aridity index (I), as shown in the following equation:
$$I = \frac{P}{T + 10}$$
where T and P are the average values of the annual temperature (°C) and precipitation (mm), respectively. Figure 1 shows the study area. Table 1 shows the geographic coordinates of the included weather stations, their elevations, and the properties of the precipitation data. In the current study, about 10% of the total precipitation data were randomly assumed to be missing and so needed to be estimated using the different statistical methods. The assumed missing data were used to check the results of the presented methods by comparing the observed precipitation with the estimations. In addition to the randomly chosen missing data, the year 2011 was selected as an example from the study period to check the performance of the applied statistical methods, i.e., the observed monthly precipitation for 2011 was compared with the estimated values. In this study, the Motta station was considered the target station.
The Motta station is located almost in the middle of the study area with respect to its longitude and latitude. After performing the statistical analysis of the available data and a quality control (a homogeneity test and trend test), the challenge was to assess the performance of various classic statistical approaches for predicting missing precipitation data. The 21 statistical methods used for estimating missing rainfall data are outlined below.
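As a small worked illustration of the aridity classification step, the sketch below evaluates the De Martonne index, assuming its standard form I = P / (T + 10). The function name `de_martonne_index` is hypothetical, and the inputs are the study-area averages quoted above.

```python
def de_martonne_index(annual_precip_mm: float, mean_temp_c: float) -> float:
    """De Martonne aridity index, assuming the standard form I = P / (T + 10)."""
    if mean_temp_c <= -10.0:
        # The index is undefined when T + 10 is zero or negative.
        raise ValueError("mean temperature must be above -10 degrees C")
    return annual_precip_mm / (mean_temp_c + 10.0)


# Study-area averages quoted in the text: P = 94.25 mm, T = 17 degrees C.
index = de_martonne_index(94.25, 17.0)
print(round(index, 2))  # 3.49
```

Lower values of the index correspond to drier conditions, which is consistent with the dry/arid indexing of the study area reported above.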

Simple Arithmetic Average (AA)
The AA method, known as the simple method, is extensively applied to estimate missing data in meteorological studies. The application of the AA method is acceptable if the stations are uniformly spread in the study area and the measurements of the individual stations do not deviate greatly from the average [27]. The missing data are estimated based on the arithmetic average of the nearest stations around the target station. The gaps in the data can be obtained as follows:
$$Y_i = \frac{1}{n}\sum_{i=1}^{n} X_i$$
where Yi is the missing climate value at the target station, Xi is the measured value of the climatic parameter in the surrounding stations, and n is the number of nearby stations.
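The averaging step described above can be sketched in a few lines of Python. This is a minimal illustration rather than the study's implementation; the function name `arithmetic_average` is hypothetical, and skipping neighbors that are themselves missing (`None`) is an assumption made here for convenience.

```python
from typing import Optional, Sequence


def arithmetic_average(neighbor_values: Sequence[Optional[float]]) -> float:
    """Estimate the target value as the plain mean of the neighboring stations."""
    # Assumption: neighbors with no reading (None) are left out of the mean.
    valid = [v for v in neighbor_values if v is not None]
    if not valid:
        raise ValueError("no neighboring observations available")
    return sum(valid) / len(valid)


# Three hypothetical neighbor readings for one month (mm).
print(arithmetic_average([10.0, 20.0, 30.0]))  # 20.0
```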

Normal Ratio (NR)
The NR method was first recommended by [28] to estimate missing rainfall data, and was later modified by Young (1992) [29]. The NR method is applied to compute the missing data if the normal annual rainfall of the surrounding stations differs from that of the target station by more than 10% [30]. This method mainly depends on the ratio of mean rainfall between the target station and each neighboring station in order to weight the impact of each neighboring gauge. The missing rainfall data can be computed by the following equation:
$$Y_i = \frac{1}{n}\sum_{i=1}^{n} \frac{N_s}{N_i} X_i$$
where Ns is the mean of the available rainfall data at the target station, Ni is the mean of the available rainfall data at the ith surrounding station, and n is the number of surrounding stations considered in the calculation.
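A minimal sketch of the normal ratio weighting, assuming the common form in which each neighbor reading is scaled by the ratio of the target mean Ns to that neighbor's mean Ni before averaging. The function and argument names are hypothetical.

```python
from typing import Sequence


def normal_ratio(target_mean: float,
                 neighbor_means: Sequence[float],
                 neighbor_values: Sequence[float]) -> float:
    """Average of neighbor readings, each scaled by Ns / Ni."""
    if len(neighbor_means) != len(neighbor_values) or not neighbor_values:
        raise ValueError("need one mean per neighbor reading")
    scaled = [(target_mean / n_i) * x_i
              for n_i, x_i in zip(neighbor_means, neighbor_values)]
    return sum(scaled) / len(scaled)


# Hypothetical case: the target is half as wet as neighbor 1 and twice as wet as neighbor 2.
print(normal_ratio(100.0, [200.0, 50.0], [40.0, 10.0]))  # 20.0
```

The scaling keeps a wetter neighbor from inflating the estimate and a drier neighbor from deflating it, which is the reason the method is preferred when station normals differ noticeably.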

Geographical Coordinates (GC)
GC is considered a weighting technique and was proposed to compute missing rainfall data [15]. The weight coefficient is calculated based on the geographical coordinates of the stations (longitude and latitude). The position of the target station represents the center point in this method. The missing data are estimated depending on the distances from the target station to the surrounding stations according to the following equation: where xi and yi are the longitude and latitude of the ith nearby station.

Normal Ratio With Geographical Coordinates (NRGC)
The NRGC method combines both the NR and GC methods and is used to predict missing rainfall data. The NRGC method is commonly used and is considered to be the best method for missing data estimation as it adjusts the location of stations to achieve the best performance, combining aspects of both methods. For the NRGC method, the missing data are computed by Equation (5).

Inverse Distance Weighting (IDW)
IDW is a common method for filling in missing data [31]. The computation of the missing rainfall values depends on the distance between the target station and the surrounding stations. The greatest weight is applied to the nearest station. In this method, the missing data are calculated using the observed data at the nearby stations, as in Equation (6):
$$Y_i = \frac{\sum_{i=1}^{n} X_i \, d_i^{-k}}{\sum_{i=1}^{n} d_i^{-k}} \quad (6)$$
where di is the distance from the target station to the ith surrounding station, and k is the distance of friction, varying from 1 to 6 [32].

Modified Inverse Distance Weighting (MIDW)
Golkhatmi et al. (2012) and Viale and Garreaud (2015) confirmed that elevation has an important effect on rainfall; therefore, the difference in elevation between the target and neighboring stations was introduced to improve the performance of the IDW method [33,34], as expressed in Equation (7), where hi is the absolute value of the difference in elevation between the target and the ith surrounding station, and the exponent a is a power variable. In this study, values of a and k ranging from 1 to 3 were checked, and values of a = k = 1 were adopted to calculate the missing rainfall data.
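The plain IDW weighting described in this section can be sketched as follows, assuming weights of the form d^(-k). The function name `idw_estimate` is hypothetical, and the elevation term of the MIDW variant is deliberately left out because its exact form is not reproduced here.

```python
from typing import Sequence


def idw_estimate(neighbor_values: Sequence[float],
                 distances_km: Sequence[float],
                 k: float = 2.0) -> float:
    """Inverse distance weighted estimate with weights d ** -k."""
    if len(neighbor_values) != len(distances_km) or not neighbor_values:
        raise ValueError("need one distance per neighbor reading")
    if any(d <= 0 for d in distances_km):
        raise ValueError("distances must be positive")
    weights = [d ** -k for d in distances_km]
    weighted_sum = sum(w * x for w, x in zip(weights, neighbor_values))
    return weighted_sum / sum(weights)


# Two hypothetical neighbors at 1 km and 2 km; the nearer one dominates.
print(idw_estimate([10.0, 20.0], [1.0, 2.0], k=2.0))  # 12.0
```

Raising k concentrates the weight on the nearest station, so very large values make the estimate approach a closest-station fill.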

Correlation Coefficient Weighted (CCW)
Teegavarapu and Chandramouli (2005) confirmed that the effectiveness of this method relies on the strength of the correlation between the target station and the surrounding stations. Therefore, the equation of the IDW method was adapted to replace the distance weights with correlation weights as follows [21]:
$$Y_i = \frac{\sum_{i=1}^{n} r_i X_i}{\sum_{i=1}^{n} r_i}$$
where ri is the Pearson correlation coefficient (rPearson) between the target station and each neighboring station.
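A minimal sketch of correlation coefficient weighting, assuming the weights are the Pearson coefficients themselves and that they are positive. The function name `ccw_estimate` is hypothetical.

```python
from typing import Sequence


def ccw_estimate(neighbor_values: Sequence[float],
                 correlations: Sequence[float]) -> float:
    """Weighted mean of neighbor readings using correlation coefficients as weights."""
    if len(neighbor_values) != len(correlations) or not neighbor_values:
        raise ValueError("need one correlation per neighbor reading")
    # Assumption: only positively correlated neighbors are used as predictors.
    if any(r <= 0 for r in correlations):
        raise ValueError("correlations must be positive")
    weighted_sum = sum(r * x for r, x in zip(correlations, neighbor_values))
    return weighted_sum / sum(correlations)


# Hypothetical neighbors with r = 0.9 and r = 0.3 relative to the target.
print(round(ccw_estimate([10.0, 30.0], [0.9, 0.3]), 6))  # 15.0
```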

Linear Regression (LR)
Linear regression is a statistical method used to estimate missing weather data at a gauge station from another station with similar climatological conditions. The LR method finds a relationship between a dependent variable Y and one independent variable X, and is commonly used in practical applications [35]. In the current study, the data from the Adet station were used to compute the missing data values of the Motta station (the target station) through the LR statistical method:
$$Y_i = a + b X_i$$
where Yi is the estimated rainfall data, and Xi is the observed rainfall value of the neighboring station; a is the intercept, and b is the regression coefficient, both of which can be computed from the following equations:
$$b = \frac{\sum_{i} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i} (X_i - \bar{X})^2}, \qquad a = \bar{Y} - b\,\bar{X}$$
where $\bar{Y}$ and $\bar{X}$ are mean values of the rainfall data in the Y and X stations, respectively.
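The least-squares fit behind the LR method can be sketched as below. The closed-form slope and intercept are the standard ordinary least squares expressions; the function names and the sample data are hypothetical.

```python
from typing import Sequence, Tuple


def fit_linear_regression(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (a, b) for the least-squares line y = a + b * x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired observations")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variance")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict(a: float, b: float, x_new: float) -> float:
    """Estimate the target value from a neighbor reading."""
    return a + b * x_new


# Months where both stations reported (neighbor x, target y), hypothetical values.
a, b = fit_linear_regression([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
print(a, b)                 # 1.0 2.0
print(predict(a, b, 5.0))   # 11.0
```

The fit uses only months in which both stations have observations; the resulting line is then applied to the neighbor reading for each month missing at the target.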

Multiple Linear Regression (MLR)
In the MLR method, the missing rainfall data are estimated by computing the regression coefficients between the target station and the most highly correlated nearby stations [6,36]: where Y is the estimated rainfall data, Xi is the observed rainfall value of the ith surrounding station, bi are the regression coefficients of the ith surrounding stations, and n is the number of nearest stations included in the calculation.

Multiple Imputation (MI)
The MI method was proposed by Rubin [37] for infilling missing data. This method should be implemented in cases where the missing data are randomly distributed. The missing rainfall data in this method were replaced by a set of realistic values considering an uncertainty in excess of the corrected precise value of the missing data to be assigned [17]. The imputation procedure for the estimation of the missing value is repeated five times and the parameter estimates are averaged through the implementation of discrete analysis [38,39]. The multiple imputation approach can be performed in different statistical packages, for example SAS, the Amelia II package, EMCOV, SPLUS, and Mplus [40,41]. In this study, the multiple imputations were implemented through XLSTAT statistical software.

NIPALS Algorithm for Missing Data (NIPALS)
The NIPALS method was first proposed by [42], and was called the NILES algorithm. The NIPALS algorithm performs principal component analysis (PCA) on datasets containing missing data through an iterative procedure. It depends on computing the slopes of the least-squares line that passes through the origin of the measured data. In this stage, the eigenvalues are calculated from the variance of the NIPALS components. The convergence of the NIPALS algorithm is related to the percentage of missing data [43]. In this study, the NIPALS algorithm for estimating the missing rainfall data was implemented through XLSTAT software.

UK Traditional Method (UK)
The UK traditional method was proposed by the UK Meteorological Office for estimating missing meteorological data (temperature and sunshine) by comparing the target station with a single nearby station [4]. In the current research, the ratio of the mean rainfall at the target station (Motta) to the mean rainfall at the neighboring station with the highest correlation coefficient (Adet) was computed. The missing rainfall data were then estimated by multiplying this ratio by the rainfall data of that neighboring station.
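The ratio scaling described above can be sketched as follows. This is an illustrative reading of the procedure, not the study's code; the function name `uk_traditional` and the sample values are hypothetical.

```python
from statistics import mean
from typing import Sequence


def uk_traditional(target_history: Sequence[float],
                   neighbor_history: Sequence[float],
                   neighbor_value: float) -> float:
    """Scale the neighbor reading by the ratio of long-term station means."""
    neighbor_mean = mean(neighbor_history)
    if neighbor_mean == 0:
        raise ValueError("neighbor mean rainfall is zero")
    ratio = mean(target_history) / neighbor_mean
    return ratio * neighbor_value


# Hypothetical records: the target is on average twice as wet as the neighbor.
print(uk_traditional([100.0, 200.0], [50.0, 100.0], 30.0))  # 60.0
```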

Expectation Maximization (EM)
The EM method was first suggested by [44] in order to solve the problems found in the maximum likelihood technique. This method combines a statistical approach with an algorithmic application and is used extensively by researchers for missing data problems [44]. The expectation step and the maximization step are the two main steps in the EM procedure. The expectation step gives the conditional expectations of the missing values given the current estimates of the model parameters, whereas the maximization step re-estimates the model parameters so as to maximize the log-likelihood function of the completed data from the first step. The two steps are iterated until convergence is reached [45].

Closest Station Method (CSM)
In this approach, the closest station to the target station is first identified. Second, the missing weather data of the target are estimated using the closest station data. Third, the estimated weather data are modified using the ratio of the long-term means for that year [4]. In the literature, different methods can be found which have a similar concept, such as the nearest neighbor (NN) and single best estimator (SIB). The NN method is considered to be a simple method for filling in missing rainfall data. It depends on the use of the data from the nearest station to fill in the missing data of the target station [46]. The nearest station is considered to be the station with the highest rPearson in relation to the target station or the closest station based on location and distance. In this method, the values of the closest station can be used to fill in the missing data without any changes [47]. The SIB approach is an analogous and simple method that uses the closest neighboring station to fill the gaps of the target station. The missing data of the target station are computed using the nearest station with the highest positive value of rPearson in relation to the target station [48].

Modified Coefficient Correlation Weighting (MCCW)
The CCW method depends on the rPearson between the surrounding stations and the target station. Suhaila et al. [2] modified the CCW method by raising the rPearson to different powers, which gives more weight to the most strongly correlated stations. The missing data can be estimated using the MCCW method as follows:
$$Y_i = \frac{\sum_{i=1}^{n} r_i^{P} X_i}{\sum_{i=1}^{n} r_i^{P}}$$
where ri is the rPearson between the target station and the ith nearby station, and P is the power of the rPearson, ranging from 2 to 6.
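To illustrate how the power P changes the weighting, the sketch below normalizes the weights r^P so that they sum to one. It assumes positive correlations; the function names are hypothetical.

```python
from typing import List, Sequence


def mccw_weights(correlations: Sequence[float], power: float) -> List[float]:
    """Normalized weights r ** power; larger powers favor the best-correlated station."""
    if any(r <= 0 for r in correlations):
        raise ValueError("correlations must be positive")
    raw = [r ** power for r in correlations]
    total = sum(raw)
    return [w / total for w in raw]


def mccw_estimate(neighbor_values: Sequence[float],
                  correlations: Sequence[float],
                  power: float = 2.0) -> float:
    """Weighted mean of neighbor readings with weights r ** power."""
    weights = mccw_weights(correlations, power)
    return sum(w * x for w, x in zip(weights, neighbor_values))


# With r = 0.9 and 0.3, P = 1 gives weights 0.75 / 0.25; P = 2 gives 0.9 / 0.1.
print([round(w, 3) for w in mccw_weights([0.9, 0.3], 1.0)])  # [0.75, 0.25]
print([round(w, 3) for w in mccw_weights([0.9, 0.3], 2.0)])  # [0.9, 0.1]
```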

Modified Correlation Coefficient with Inverse Distance Weighting (MCCIDW)
This method is a combination of the IDW and CCW methods and is used for estimating missing weather data values [2]. The IDW technique mainly depends on the distance from the target station to the nearest stations. The MCCIDW method assigns a power, ranging from 1 to 6, to both the correlation coefficient and the distance, and the missing data can be calculated from the following formula:

Modified Normal Ratio with Inverse Distance (NRID)
The NRID method is a combination of the modified NR method [29] and the IDW method and is considered the simplest approach for estimating missing weather data [49]. The modified NR method mainly reflects the positive spatial correlation between the target station and the nearby stations. The following formula can be used to compute the missing rainfall data using the NRID method [50]:

Modified Old Normal Ratio with Inverse Distance (ONRID)
A combination of the NR method and the ID method improves the results of both methods in terms of filling in missing data [2]. The ONRID method is a combination of the modified old normal ratio method and the ID method [50]. The missing data are estimated using the following equation [2,50]:

Normal Ratio Inverse Distance Weighting with Correlation (NRIDC)
The NRIDC is considered to be the same method as the NRID, proposed by Suhaila et al. (2008), with the addition of the correlation coefficient [2,50]. According to this method, the missing data can be estimated using the following formula:
where the power of the correlation coefficient P should be more than 4.

Modified Normal Ratio Based on Correlation (MNR)
Young (1992) modified the old NR method by including the correlation coefficient between the target station and the surrounding stations [29]. Therefore, the weighting of this method and the formula for calculating the missing data are given as follows:

Modified Normal Ratio Based on Square Root Distance (MNR-T)
Tang et al. (1996) first discussed the impact of the distance from the target station to the ith surrounding station and proposed the MNR-T method for filling in precipitation data gaps in Malaysia [11]. The MNR-T method is calculated as follows [11]: where the power of the distance p ranges from 1.5 to 2.

Methods Performance
The performance of the infilling methods was compared using six error indices: mean absolute error (MAE), root-mean-square error (RMSE), coefficient of efficiency (CE), similarity index (S-index), skill score (SS), and rPearson. These measures compare the estimations with the observed values and are given as follows.

i. Mean absolute error (MAE)
The MAE is considered to be a valuable measure used in different model evaluations. It measures the average magnitude of the estimation error and is recommended by Willmott et al. (2009) [51]. The best method for estimating the missing values is the one with the lowest computed MAE. The value of MAE ranges from 0.0 to +∞ [52]. The MAE is computed using the following equation:
$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N} \left| E_i - O_i \right|$$

ii. Root mean square error (RMSE)
The RMSE is usually used to evaluate the performance and efficiency of different estimation models in meteorological research studies [53][54][55]. It measures the difference between the estimated and observed values. The best method gives the lowest computed value of the RMSE. The RMSE value varies from 0 to +∞. The RMSE is presented as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} \left( E_i - O_i \right)^2}$$

iii. Coefficient of efficiency (CE)
The coefficient of efficiency values range from −1 to +1. A CE value of 1.0 shows a perfect match between the estimated data and the measured data. A CE value of 0.0 shows that the method's estimations are as accurate as the mean value of the measured data. However, a CE value less than 0.0 shows that the mean of the observed values is a better estimator than the model. A CE value close to 1.0 shows good accuracy [4]. CE is calculated from the following equation:

iv. Similarity index (S-index)
The S-index is the index of agreement for evaluating the method performance; it expresses the degree of agreement between the estimated and observed values. The values of the S-index vary between 0.0 in a case of complete disagreement and 1.0 in a case of perfect and reliable agreement [56]. The similarity index is computed as follows:

v. Skill score (SS)
The SS is used to measure the quality of the method in terms of estimating the missing data. A positive value of SS shows that the method improves the estimates. The closer the SS value is to 1.0, the more reliable the estimation. An SS value of 1.0 indicates a perfect estimation of the missing data [57][58][59]. The skill score index (SS) is calculated as follows:

vi. Pearson correlation coefficient (rPearson)
The correlation coefficient indicates the strength of the relationship between the observed and estimated data. A higher positive value of the Pearson coefficient shows that the estimates are high when the observed values are high and low when they are low, and gives evidence that the method is suitable for predicting missing data [13,60]. The correlation coefficient can be calculated from the following equation:
$$r_{Pearson} = \frac{\sum_{i=1}^{N} (E_i - \bar{E})(O_i - \bar{O})}{\sqrt{\sum_{i=1}^{N} (E_i - \bar{E})^2 \, \sum_{i=1}^{N} (O_i - \bar{O})^2}}$$
where $E_i$ is the estimated value, $O_i$ is the observed value, $\bar{E}$ and $\bar{O}$ are the average precipitation values of the estimated and observed data, respectively, and N is the number of compared values.
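Three of the six indices have unambiguous textbook definitions (MAE, RMSE, and the Pearson coefficient) and can be sketched directly. The helper names below are hypothetical, and the CE, S-index, and SS indices are not included because their exact formulas are not reproduced in this section.

```python
import math
from typing import Sequence


def mae(estimated: Sequence[float], observed: Sequence[float]) -> float:
    """Mean absolute error between paired estimates and observations."""
    return sum(abs(e - o) for e, o in zip(estimated, observed)) / len(observed)


def rmse(estimated: Sequence[float], observed: Sequence[float]) -> float:
    """Root-mean-square error between paired estimates and observations."""
    return math.sqrt(sum((e - o) ** 2 for e, o in zip(estimated, observed)) / len(observed))


def pearson_r(estimated: Sequence[float], observed: Sequence[float]) -> float:
    """Pearson correlation coefficient of the paired series."""
    n = len(observed)
    mean_e = sum(estimated) / n
    mean_o = sum(observed) / n
    cov = sum((e - mean_e) * (o - mean_o) for e, o in zip(estimated, observed))
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov / math.sqrt(var_e * var_o)


# Hypothetical estimates against observations (mm).
est, obs = [1.0, 2.0, 3.0], [1.0, 2.0, 5.0]
print(round(mae(est, obs), 4), round(rmse(est, obs), 4))  # 0.6667 1.1547
```

Because RMSE squares the errors before averaging, a single large miss raises it more than it raises MAE, which is why the two are reported together.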

Results and Discussion
In the following section, the results of the current study are presented in two subsections: first, the accuracy of the archived data, and second, the analysis of the applied methods, in which the observed monthly precipitation is compared with the estimated values using different statistical indices. In order to check the accuracy of the available data, the standard normal homogeneity test (SNHT), Pettitt's test, and the Mann-Kendall (MK) trend test were applied to the precipitation data using the XLSTAT software (Table 3 and Table 4). Alexandersson (1986) developed the SNHT in order to detect inhomogeneities in precipitation data series [61]. The MK trend test was developed by [62][63][64] to evaluate whether there is a monotonic downward or upward trend in the target parameter over time. In the SNHT, the null hypothesis (H0) indicates that the data are homogeneous, while the alternative hypothesis (H1) indicates that the data are heterogeneous.

Accuracy of the Station Data
In the MK trend test, the null hypothesis (H0) is randomness and the absence of any trend in the data; on the other hand, the alternative hypothesis (H1) indicates that the data are non-random and that a trend exists. If the p-value exceeds the significance level (α), the null hypothesis (H0) is accepted; otherwise, the alternative hypothesis (H1) is confirmed. The results of the SNHT show that the p-value for all stations ranged from 0.069 (Yetemen) to 0.995 (Haik), with the consequence that the null hypothesis (H0) is accepted and the monthly data at each station are homogeneous. Tables 3 and 4 confirm that the monthly precipitation data are independent and homogeneous at all of the stations used, and as a result can be used with confidence.
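The decision rule for the MK trend test can be illustrated with a short sketch. It uses the standard S statistic with the normal approximation and a continuity correction, and it assumes no tied values (no tie correction in the variance). The study itself used XLSTAT, so this is only an illustration; the function name `mann_kendall` is hypothetical.

```python
import math
from typing import Sequence, Tuple


def mann_kendall(series: Sequence[float]) -> Tuple[int, float, float]:
    """Return (S, Z, two-sided p-value) for the Mann-Kendall trend test."""
    n = len(series)
    if n < 3:
        raise ValueError("series is too short")
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of each later-minus-earlier pair
    # Variance of S under H0, assuming no ties in the data.
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return s, z, p_value


alpha = 0.05
s, z, p = mann_kendall([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print(s, p < alpha)  # 45 True, so H0 (no trend) is rejected for this toy series
```

For the station data reported above, the p-values exceed α, so the same rule leads to accepting H0.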

Comparison Between the Results of the Proposed Methods
In the current study, we hypothesized that 10% of the data might not be measured and so may need to be estimated. Firstly, the 21 methods were applied to the random missing monthly data (24 months), and the observed precipitation of the target station was compared to the estimated values from the applied methods. Secondly, the performance of the applied methods was checked to estimate the monthly precipitation of the years 2004 and 2011. Thirdly, the accuracy of the proposed methods was checked by estimating the precipitation over the period from 1990 to 1998. The number of the nearest stations engaged in the different estimation methods was reliant on the method itself. For example, for the UK, CS, and LR methods, only data from one station is used (that with the highest correlation coefficient in relation to the target station). On the other hand, in the AA, NR, GC, NRGC, IDW, MIDW, CCW, NIPALS, EM, MCCW, MCCWID, NRID, ONRID, NRIDC, MNR, and MNR-T methods, data from all the neighboring stations were used in the calculations. In the MLR method, data from five nearby stations with strong correlation values with the target station were used in the method calculations. The best obtained results for the MLR method were observed when the monthly precipitation data at the Adet, Bahir Dar, Degelo, Haik, and Yetemen stations were implemented. In the MI method, different combinations of input numbers regarding neighboring stations (differing from one station to five stations) were implemented to determine which of them performed best. In the MI method, the best results were attained by including data from five stations. The best obtained results for the MI method were observed when the monthly precipitation data at the Adet, Bahir Dar, Combolcha, Dejein, and Gondar stations were used and compared with measured data. Table 5 shows the results of six different performance criteria for comparison between the different statistical methods.  
Table 5 shows that among the presented methods, the simple NR, NRID, MLR, NRIDWCC, ONRID, IDW [...]. Table 5 and Figure 2 show that, of the implemented statistical methods, the NR, MLR, IDW, CCW, and AA methods are the most reliable and accurate. The NR method achieves accurate estimations, with an rPearson of 0.945, MAE of 22.90 mm, RMSE of 33.695 mm, CE index of 0.998, similarity index of 0.999, and skill score of 0.998. The NR is followed by the MLR method, with an rPearson of 0.940, MAE of 23.181 mm, RMSE of 35.573 mm, CE index of 0.998, similarity index of 0.999, and skill score of 0.998. The results confirm that the NR, MLR, IDW, CCW, and AA methods are the most suitable and accurate methods for estimating missing precipitation data. These results support the findings of [4,6,12,65]. Most importantly, the NR, MLR, IDW, CCW, and AA methods may be used in other arid areas with similar climate conditions. Time series charts and scatter diagrams comparing the observed and estimated values of precipitation are shown in Figures 2, 3, and 4. Figure 2 compares the observed precipitation with the estimated values for the random missing data. Figure 3 compares the observed and estimated precipitation for the monthly data of the year 2004 using the 21 methods. The results of Figure 3, comparing the estimated monthly data with the observed values for the year 2004, confirm the results of Figure 2 for estimating random missing data and the analysis of Table 5. Figure 4 compares the observed and estimated precipitation for the monthly data of the year 2011 using the 21 methods. The results of Figure 4, comparing the estimated monthly data with the observed values for the year 2011, confirm the results of Figures 2 and 3 for estimating random missing data and the analysis of Table 5.
Figure 5 compares the observed and estimated precipitation for the monthly data of the period from 1990 to 1998 using the 21 methods. The results of Figure 5, comparing the estimated monthly data with the observed values for the years from 1990 to 1998, confirm the results of Figures 2, 3, and 4 for estimating missing data and the analysis of Table 5. Figure 6 shows the scatter plot comparing the monthly observed and estimated values of precipitation using the different methods. The R-squared (R²) values varied across the methods, ranging from 0.7472 to 0.893. The best estimates based on R² were achieved for the NR and AA methods, with R² equal to 0.893 and 0.891, respectively, followed by MNR (0.

Conclusions
In the current study, the monthly precipitation at 15 stations located in the Upper Blue Nile Basin (UBNB) was considered for the period from 1980 to 2013. The collected data were first tested using the MK trend test, SNHT, and Pettitt's test. The precipitation data were homogeneous at all of the included stations, and no trends existed. Twenty-one different statistical methods for estimating missing precipitation data were applied; these were the AA, NR, GC, NRGC, MIDW, [...] 24.128 to 28.860 mm, respectively. On the other hand, the CS, EM, UK, LR, and MI methods achieved the lowest accuracies, with MAE and RMSE values ranging from 30.424 to 47.641 mm and from 49.564 to 58.765 mm, respectively. As a result of their simplicity and high accuracy, the NR, MLR, IDW, CCW, and AA methods are recommended for filling in missing climate data in arid climates. The results reported in this research suggest that the recommended methods are applicable to arid regions in other countries.