Comparative Analysis of Different Spatial Interpolation Methods Applied to Monthly Rainfall as Support for Landscape Management

Landscape management requires spatially interpolated data, whose outcomes are strictly related to models and geostatistical parameters adopted. This paper aimed to implement and compare different spatial interpolation algorithms, both geostatistical and deterministic, of rainfall data in New Zealand. The spatial interpolation techniques used to produce finer-scale monthly rainfall maps were inverse distance weighting (IDW), ordinary kriging (OK), kriging with external drift (KED), and ordinary cokriging (COK). Their performance was assessed by the cross-validation and visual examination of the produced maps. The results of the cross-validation clearly evidenced the usefulness of kriging in the spatial interpolation of rainfall data, with geostatistical methods outperforming IDW. Results from the application of different algorithms provided some insights in terms of strengths and weaknesses and the applicability of the deterministic and geostatistical methods to monthly rainfall. Based on the RMSE values, the KED showed the highest values only in April, whereas COK was the most accurate interpolator for the other 11 months. By contrast, considering the MAE, the KED showed the highest values in April, May, June and July, while the highest values have been detected for the COK in the other months. According to these results, COK has been identified as the best method for interpolating rainfall distribution in New Zealand for almost all months. Moreover, the cross-validation highlights how the COK was the interpolator with the best least bias and scatter in the cross-validation test, with the smallest errors.


Introduction
The accurate spatial modelling of rainfall distribution has a widely recognized value among scholars and practitioners of different scientific and engineering fields (e.g., hydrology, water balances and resources management, regional impacts of global change, etc.) [1][2][3][4][5]. Moreover, sustainable landscape management requires spatially interpolated data, whose outcomes are strictly related to models and geostatistical parameters adopted. In fact, temperature and rainfall patterns significantly contribute to the configuration of natural vegetation types [6] and affect the suitability of territory to specific crops. Moreover, an accurate mapping rainfall distribution represents a key data input in modelling and monitoring the effects of climate change [7,8]. Referring to these mentioned uses of spatial rainfall models at different timescales, it is worth highlighting their fundamental role in the parametrization of crop models simulating crop growth as well as in the implementation of hydrological models aiming at flood forecasting [9].
A prerequisite for implementing the spatial rainfall distribution is the availability of official climate databases with sufficient temporal and spatial completeness. Indeed, to implement reliable hydrological or environmental models, there is the need to accurately assess the choice and set-up of spatial interpolation models that can affect the obtained results introducing significant errors. Rainfall is not a regular phenomenon and is characterized by high spatial variability.
Usually, meteorological networks are widely dispersed in a territory without a defined sample scheme. Moreover, rain gauge networks just collect point estimates, and in most cases, their number is quite limited [10,11]. Thus, there is the need to estimate rainfall values at unrecorded locations using the values at surrounding sites [12] to spatially interpolate the original discrete point measurements so as to obtain continuous surfaces. Notwithstanding the possibility of using different precipitation products as rainfall proxies, rain gauge databases today represent the primary reference to implement hydrological models [13], largely because of the temporal consistency of these data that in most cases are multi-decades time-series.
In the last two decades, several authors have attempted to identify the most reliable and accurate method of spatial rainfall interpolation in different geographical areas and spatial domains (continental, national, regional, local) [9][10][11][14][15][16]. Referring to the spatial interpolation of the rainfall phenomenon, we have to consider that the most reliable interpolator may vary depending on the different regions in which it is applied [3,17].
In the scientific literature, several spatial interpolation methods have been proposed and compared by scholars to map rainfall data in a reliable way in various world areas. There is no recent comprehensive review analysing the spatial interpolators tested for rainfall spatialisation to our best knowledge. Indeed, the work of Li and Heap [18] found more than fifty spatial interpolation models applied in the estimation of spatial rainfall spatial distribution.
However, other authors have demonstrated that the reliability of the obtained results also depends on the point sampling density and distribution [34] while current scientific literature does not support the suitability for an interpolation method to be applied in every region of the Earth [24]. Therefore, a general consensus does not exist in the literature on the best method for interpolating monthly rainfall, and several research articles comparing different spatial interpolation methods have been published in recent years [10,14,15,35].
The present paper aimed to compare different methods for interpolating spatial patterns of monthly rainfall in New Zealand. Considering that, in New Zealand, agriculture represents one of the most important economic sectors, including with reference to overseas exports, the knowledge of a reliable spatial distribution of rainfall in the different areas of this country is crucial. Moreover, these kinds of research findings are helpful in implementing sustainable landscape management programmes that require precise knowledge of diverse spatial phenomena and whose outcomes are frequently related to models and geostatistical parameters adopted. In the above-depicted scope, this research provides a comparative analysis with significant novelties. Among these, we proposed spatial interpolation approaches of monthly precipitation based on multivariate geostatistics integrating auxiliary variables such as elevation and geographic coordinates.

Study Area and Observation
New Zealand is an island country located in the southwestern Pacific Ocean, east of the Australian continent, between 34 • to 47 • S and between 164 • to 179 • E, with an area of 270,000 km 2 . It consists of two main islands presenting a northeast-southwest oriented shape. The North Island is mainly characterized by plains but it also presents mountains reaching a maximum altitude of more than 2700 m a.s.l. Conversely, the South Island is mainly mountainous, with the Southern Alps reaching an altitude of 3764 m (Figure 1).

Study Area and Observation
New Zealand is an island country located in the southwestern Pacific Ocean, east of the Australian continent, between 34° to 47° S and between 164° to 179° E, with an area of 270,000 km 2 . It consists of two main islands presenting a northeast-southwest oriented shape. The North Island is mainly characterized by plains but it also presents mountains reaching a maximum altitude of more than 2700 m a.s.l. Conversely, the South Island is mainly mountainous, with the Southern Alps reaching an altitude of 3764 m ( Figure 1). Generally, New Zealand's weather patterns reflect those of the oceanic mid-latitude locations [36,37], but it is also affected by some physical factors [38]. The first factor is the location of the country in the Pacific Ocean, which influences air circulation and seasonal climate. The air masses coming from the tropical zones of the Pacific lead to a humid weather, with fresh summers and mild winters, and temperatures oscillating between 18 °C and 21 °C, although snow and frost occur as a consequence of the oceanic currents from Antarctica. The presence of the mountains in the eastern side of the North Island and of the 750-km long, over 3000-m high Southern Alps traversing the South Island constitutes the second factor affecting air circulation and determining New Zealand's climate [39,40]. As a result, large differences in the rainfall volumes occur even over short distances between the mountain areas and the eastward plans. Moreover, summers in the northern part of the North Island are influenced by the westerly wind perturbations [38]. Finally, although more than 2000 km divide New Zealand from the Australian continent, this significantly influences New Zealand seasonal climate because of the cyclogenesis region moving through the Tasman Sea which is responsible for the significant differences in the Generally, New Zealand's weather patterns reflect those of the oceanic mid-latitude locations [36,37], but it is also affected by some physical factors [38]. The first factor is the location of the country in the Pacific Ocean, which influences air circulation and seasonal climate. The air masses coming from the tropical zones of the Pacific lead to a humid weather, with fresh summers and mild winters, and temperatures oscillating between 18 • C and 21 • C, although snow and frost occur as a consequence of the oceanic currents from Antarctica. The presence of the mountains in the eastern side of the North Island and of the 750-km long, over 3000-m high Southern Alps traversing the South Island constitutes the second factor affecting air circulation and determining New Zealand's climate [39,40]. As a result, large differences in the rainfall volumes occur even over short distances between the mountain areas and the eastward plans. Moreover, summers in the northern part of the North Island are influenced by the westerly wind perturbations [38]. Finally, although more than 2000 km divide New Zealand from the Australian continent, this significantly influences New Zealand seasonal climate because of the cyclogenesis region moving through the Tasman Sea which is responsible for the significant differences in the precipitation amount occurring over short distances especially in the South Island [41][42][43][44]. Clear sky conditions in winter lead to considerable surface cooling, which, in turn, results in stable lower atmospheric conditions. This has an impact eastward since it prevents the western airflow from significantly affecting New Zealand. On the contrary, large convective hot air masses coming across the Tasman Sea cause robust airflows affecting New Zealand in summer [25].
New Zealand's climate has been carefully described by The National Institute of Water and Atmosphere Research (NIWA) of New Zealand [45]. According to NIWA, warm, humid summers and mild winters affect the North Island, in which winter is the rainiest period of the year, while summer and autumn are characterized by tropical storms and intense precipitation. The Central Region of the North Island is the least affected by winds that heavily sweep many areas of the country. Stable, dry, and warm summers with unstable cool winters are typical of this region. The south-west part of the North Island does not present many climate extremes with a stable weather in summer, which is usually warm, and at the beginning of autumn and unsettled, although not very cold, winters. The sunniest area of New Zealand is the northern part of the South Island, characterised by warm, dry and stable summer. In the western side of the South Island is affected by extreme yearly precipitation higher than average, although occasional dry spells my occur between the end of the summer and the winter months. The eastern and inland areas of the South Island are characterized by low mean annual rainfall and by long dry episodes in the summer season.
With the aim to perform a spatial analysis of the rainfall in New Zealand, monthly observations have been extracted from the NIWA database for the period 1951-2012. Several studies on New Zealand climate [36,[46][47][48][49][50] made extensive use of this database, which has been acknowledged as a high-quality database with almost complete records. In 2012 the original database included monthly observation from 3011 rain gauges (1 station per 89 km 2 ). In order to perform a reliable analysis, in this study only the stations working in 2012, with more than 50 years of observation and less than 5% of missing data have been considered. As a result of this preliminary check a final database of monthly observation from 294 rain gauges (1 station per 913 km 2 ) has been considered ( Figure 1).

Methodology
With the aim to determine the best technique in reproducing the actual surface, both deterministic and geostatistical algorithms were applied to generate the rainfall maps of New Zealand at a monthly scale. The performance of each method was evaluated. All the procedures mentioned above have been implemented in an R statistical computing environment, mainly using the 'gstat' package [51].

The Inverse Distance Weighting (IDW)
One of the most popular and adopted deterministic methods in reproducing the actual surface is the IDW [52], which is a non-linear interpolation technique that considers the measured values in neighbouring points to evaluate a value for any unmeasured location. A neighbourhood about the interpolated point is identified and a weighted average is taken of the observation values within this neighbourhood. The weights are a decreasing function of distance computed as follows.
wherev i is the value to be estimated, v i is the known value, d p i . . . , d p n are distances from n data points to the power of p of the point estimated.
The lower the exponent, the more uniformly all neighbours are incorporated into the calculation (regardless of their distance), and therefore, the "smoother" the estimated surface. The higher the exponent, the more accentuated and "unsettled" is the surface because only the nearest neighbours' weight is integrated into the interpolation.
To minimise the interpolation errors, a power value of 2 for monthly time steps has been performed and applied in the estimation for IDW.

Geostatistical Methods
Although the IDW is largely applied, it lacks any direction-specific (anisotropic) information, thus ignoring the spatial correlations which are not incorporated into the estimation. In some scientific contributions, the IDW was used as a basic approach (i.e., a reference) to assess the spatial prediction [53]. This weakness does not occur in geostatistical approach. For interested readers a detailed presentation of geostatistical theories can be found in [26,[54][55][56]. Usually the spatial interpolation is performed by using a weight of observed regionalised values to evaluate a regionalised value in unmeasured points. Kriging is a widely used interpolation technique that takes into account the spatial dependent correlation of environmental variables assigning more weight to neighbouring points [56].
Kriging also provides an uncertainty estimate. Semi-variogram modelling [57] is the first and the key step between spatial description and spatial prediction in kriging computation. After computing the empirical semi-variogram, for all pairs of locations in a given distance, this is fitted with a theoretical model, which is a function of the lag of data pair values, that is used to evaluate the weight by means of the different equation systems of the several krigings: ordinary kriging (OK), kriging with an external drift (KED), and cokriging (COK). In this study, we did not provide local variograms, considering that, as demonstrated by Lloyd [53], their use leads to a small reduction in prediction errors.

Ordinary Kriging (OK)
Ordinary kriging is the most popular kriging interpolator. It provides an estimate of the value of the random variable Z(x) at an unsampled point x 0 based on the weighted average of observed neighbouring points Z(x α ) within a specific area [56]: where λ α are the ordinary kriging weights and n(x 0 ) is the number of measured values of pairs of points multiplied by their spatial distance h. The weights are obtained so that the estimate is unbiased and the variance is minimised.

Kriging with an External Drift (KED)
In geostatistics, the kriging with an external drift (KED) method allows the use of auxiliary information, available at all locations, which has an effect on the local spatial trend of the dependent variable. So, the predictions are made as with kriging, with the difference that the covariance matrix of residuals is extended with the auxiliary predictors.
The secondary variable's spatial behaviour is similar to an indicator of the general trend typical of predictions' representation. In this study, the KED method was applied because it allows incorporation of the elevation and geographic coordinates as the secondary variables. This choice is in accordance with the rainfall observation analysed, which have a global elevation-controlled trend in the south-western to north-east direction.
Let z(x) be the external deterministic variable. The model for the random function is: where Y R (x) is a residual stationary random field, z(x) is known everywhere and a and b are coefficients to be estimated if Y R (x) is needed. Kriging of Y at a location x 0 where it is unknown is a linear combination of the data: where the Y(x i )'s are the values at the n measured sites and the λ i 's solve the following kriging system: The KED requires a less demanding semivariogram analysis compared to OK which requires a semivariogram for each of the covariates.
The last constraint is added so that the linear predictor Y*(x 0 ) filters the effect of the external drift, µ 1 and µ 2 are the Lagrange multipliers that account for the unbiasedness constraints with the minimal prediction variance.

Ordinary Cokriging (COK)
The last methodology applied in the present study is ordinary cokriging (COK). It offers additional advantages over ordinary kriging. It involves the use of a secondary variable (covariate) that is cross-correlated with the primary or sample variable of interest. The secondary variable is usually sampled more frequently and regularly, thus allowing estimation of unknown points using both variables 'globally' for the mean of all estimates but also 'conditionally' for the estimates within individually specified grade categories. This can aid in minimizing the error variance of the estimation.

Cross-Validation
The process of deciding whether the numerical results quantified by different interpolators are acceptable as descriptions of the data, is known as cross-validation, which tests the model's capability to predict data. Performing the cross-validation, some of the data are removed before training begins. Then when training is done, the data that were removed can be used to test the performance of the learned model on "new" data. The actual error incurred in this process is measured by the difference between the actual and estimated value [58][59][60].
The accuracy of calculation was measured by the mean absolute error (MAE) and the root-mean-square error (RMSE). The MAE and RMSE are calculated for the data set as: where |e i | = |y i −ŷ i |, actual = y i , and predicted =ŷ i . The closer the values of RMSE are to zero the more accurate the model results will be. The RMSE value cannot be considered a performance criterion to define the reliability of the kriging estimate [12], even if it is used in all geostatistical methods. Many studies regarding spatial interpolation of rainfall [34], evaluation of the best interpolation methods for mapping climatic variables at regional scale [61], and spatial distribution of rainfall [62] have performed the MAE and RMSE calculation in order to compare the results. A graphical result of the goodness of fitting actual and estimated values is given by the scatterplot: the closer its slope is 1, the more accurate the estimation is.

Results and Discussion
In this section, a comparison of the results obtained through deterministic and geostatistical algorithms is presented. In Table 1, the results of the exploratory statistical analysis of monthly rainfall are shown. Mean monthly rainfall ranges between 83.2 mm (February) and 117.4 mm (July). The maximum and the minimum monthly values have been detected in January (632.6 mm) and in July (20.4 mm), respectively. The coefficients of variation evaluated for the monthly rainfall values range between 49% and 66%, thus evidencing a low spatial heterogeneity of rainfall in New Zealand ( Table 1). The exploratory statistical analysis of the precipitation observation was mainly performed to evaluate whether the monthly values fit a normal distribution. Indeed normality is a desirable property for kriging, that will generate the best absolute estimate only if the random function fits a normal distribution [30]. As a result, appreciably different monthly mean and median values and high values of skewness coefficients were observed (Table 1). Therefore, a priori logarithmic transformation of the data was performed after which normality was obtained (similar mean and median values and close to zero values of the coefficients of skewness).
Due to the station density, 1 station per 913 km 2 , the spatial variability has been considered to be equal in the different directions, and thus an omnidirectional semivariograms, evaluated for the monthly rainfall observation, has been considered to carried out the spatial dependence analysis. To represent the empirical semivariogram by fitting a theoretical model, it was found that using a combination of theoretical models results in a more accurate fit onto the empirical semivariance than using a single model. Therefore, the semivariance of three models, precisely Gaussian, spherical, and circular, were used to fit the empirical semivariograms of monthly rainfall observation and for avoidance of negative interpolations. For all cases, two models combined showed the best fittings. The results of variograms parameters are different for each month (not shown). An example of experimental semi-variograms and their fitted models for the three geostatistical methods, computed for June, is shown in Figure 2.
In order to match the interpolation performance of the adopted algorithms, the crossvalidation has been applied. The cross-validation results (scatter plots of predicted versus measured values) for June are shown for the four methods in Figure 2.

The Inverse Distance Weighting (IDW)
For the IDW interpolation, which uses a simple algorithm based on distance, the closest measured values had the most influence. Typically, the spatial distribution evaluated with this method showed marked areas in correspondence of the high or low rainfall values (Figure 3). In fact, generally, this method produces spikes around the measured points [63], although it can depict well local detailed changing as shown by Shi et al. [64]. On the other hand, Chen and Liu [65], using IDW in estimating the spatial distribution of rainfall in the middle of Taiwan, obtained an higher accuracy in the dry season, also revealing a better ability of this estimator to predict the spatial distribution of small events.
For the several months, the coefficient of determination (R 2 ) of the IDW of measured versus predicted rainfall, showed the lowest values among the different interpolation methods, ranging between 0.5318 and 0.7324 ( Figure 2 and Table 2). Moreover, as a consequence of the cross-validation test, this method did not result, in any analysed month, the best interpolating rainfall method (Table 3).  Figure 3. Mean monthly precipitation interpolated using inverse distance weighting (IDW) method. Figure 3. Mean monthly precipitation interpolated using inverse distance weighting (IDW) method.

Ordinary Kriging (OK)
The ordinary kriging maps are more smoothed than the maps by IDW, especially for maximum and minimum rainfall values representations (Figure 4). Compared to the others geostatistical methods, lower results for OK were achieved by correlation coefficients of measured versus predicted monthly rainfall, particularly for the wettest months, June (0.8236), July (0.8277), and August (0.8074) (Figure 2 and Table 2).  Considering that the OK method tends to underestimate mean rainfall values, as well as IDW, it is not the best interpolator for spatially analysing the rainfall observation in any considered month. It is confirmed by the MAE and RMSE results.

Kriging with an External Drift (KED)
Improvements to accuracy for monthly rainfall interpolations in KED approach were realized by the use of elevation, extracted from a digital elevation model, and geographic coordinates as the secondary information. In the area without data, elevation, mainly extracted from a DEM, represents easily available data, which can be used to be incorporated into multivariate geostatistics of rainfall. A minimal improvement in the interpolation accuracy for monthly rainfall occurred since elevation and geographic coordinates were used as secondary variables, considering that a significantly smaller error of estimates was not achieved. Figure 5 shows the rainfall maps for 12 months obtained by using the KED technique. The results of the correlation coefficients of measured versus predicted monthly rainfall, which ranging from 0.67 to about 0.85 ( Figure 2 and Table 2), attest that the KED interpolator, in few months, operated better than the COK interpolator, which was, however, the best methods for most months. In fact, differently from the IDW and the OK, the MAE and RMSE results indicated that the KED method was the best approach in some autumn and winter months (Table 3).

Ordinary Cokriging (COK)
A comparison between the use of secondary information, such as elevation, both in KED and COK interpolators, confirms that the accuracy improvement was not reached considering the error of estimates. Consequently, although adding secondary variables can increase precision, in the case study the produced maps by COK are similar to those obtained by KED and OK. Figure 6 shows the rainfall maps for 12 months obtained by using the COK technique. Therefore, as already evidenced in previous studies [12] and confirmed in this one, when the correlation between rainfall and elevation is too small, the advantages derived by using multivariate techniques is negligible. The values of the correlation coefficients of measured versus predicted monthly rainfall ranged between 0.68 and 0.84 ( Figure 2 and Table 2). The COK interpolator can be evaluated as the best approach to analysing rainfall observation in many analysed months (Table 3) as confirmed by the MAE and the RMSE results.

Comparison of the Interpolation Methods
The analysis of deterministic and geostatistical methods allowed to identify strengths and weaknesses of the applicability of the different algorithms to monthly rainfall in New Zealand. Since IDW considers only the distance between estimated and sampled points and leaves out the pattern of spatial dependence of rainfall observation, as expected, substantially different results have been obtained by applying the IDW and the geostatistical methods.
In Table 3 results of the cross-validation are shown in terms of MAE and RMSE values, providing clear indications of the effectiveness of kriging in the spatial interpolation of monthly rainfall observation in New Zealand. In fact, the IDW technique has been clearly outperformed by the geostatistical approaches, thus confirming past studies (e.g., [12,63]) which found IDW to be less accurate than geostatistical interpolators in the estimation of monthly and annual rainfall.   Specifically, considering the RMSE, the COK has been identified as the most accurate interpolator in 11 months with the exception of April when the KED resulted more precise. However, taking into account the MAE, the KED has been identified as the most Specifically, considering the RMSE, the COK has been identified as the most accurate interpolator in 11 months with the exception of April when the KED resulted more precise. However, taking into account the MAE, the KED has been identified as the most accurate interpolator in April, May, June and July, and the COK in the remaining months. The COK can be thus indicated as the best method for the spatial interpolation of rainfall in New Zealand for almost all months. The IDW and OK methods were not the best approaches for interpolating rainfall in any analysed month. Overall, the COK evidenced the best least bias and scatter in the cross-validation test, with the lowest errors. Finally, interesting results can be obtained from the visual comparison among the maps obtained from the application of the different geostatistical interpolation algorithms. In fact, similar rainfall spatial distributions, have been produced with the geostatististical interpolators, and all these distributions seem to be consistent with natural precipitation. In particular, as demonstrated in previous studies [12,14,53,[66][67][68], adding elevation and geographical coordinates as secondary variables in the application of the KED and the COK, did not lead to a significant improvement of the results, that is explainable considering the weak correlation identified between precipitation and the secondary variables. In fact, as pointed out by Goovaerts [12], better results of the COK with respect to the OK can be observed with an increase in the correlation between precipitation and altitude. Moreover, Goovaerts [26] evidenced that the impact of the auxiliary data on the COK interpolations generally depends on the correlation between primary and secondary variables but also on their spatial continuity patterns. Perhaps, as suggested by Kyriakidis et al. [66], using monthly time series of covariates (e.g., monthly wind speed or humidity) as additional information in the KED and the COK instead of variables fixed in time (e.g., topography and geographical coordinates) can lead to a much better cross-validation than those obtained by the OK. Unfortunately, these variables are available in New Zealand only for short observation periods and in very few stations.

Conclusions
Agriculture in New Zealand is the largest sector of the tradable economy; consequently, knowledge of the spatial distributions of rainfall in this country is essential for water resource management. In fact, New Zealand presents large spatial differences in the rainfall distribution. As an example, although the Southern Alps are one of the rainiest places in the world, the east side of the South Island often suffers from droughts. Moreover, sustainable landscape management requires spatially interpolated data, whose outcomes are strictly related to models and geostatistical parameters adopted.
In this paper, in order to generate precipitation maps at monthly scale, both deterministic (inverse distance weighting) and geostatistical (ordinary kriging, kriging with an external drift, and ordinary cokriging) methodologies have been applied and the prediction performance of each method was evaluated through cross-validation and visual examination of the precipitation maps produced. Results of the cross-correlation confirmed the worse accuracy of IDW than geostatistical methods. In particular, although the COK has been identified as the best method in the interpolation of monthly rainfall in New Zealand, the visual comparison of the spatial results obtained with the different geostatistical methods evidenced similar rainfall distributions, thus demonstrating that no significant results have been achieved by adding elevation and geographical coordinates as secondary variables. Further analysis could be conducted to improve the spatial distribution of monthly rainfall by adding new secondary variables, such as wind speed or humidity.