Comparative Analysis of Spatial Interpolation Methods in the Mediterranean Area : Application to Temperature in Sicily

An exhaustive comparison among different spatial interpolation algorithms was carried out in order to derive annual and monthly air temperature maps for Sicily (Italy). Deterministic, data-driven and geostatistics algorithms were used, in some cases adding the elevation information and other physiographic variables to improve the performance of interpolation techniques and the reconstruction of the air temperature field. The dataset is given by air temperature data coming from 84 stations spread around the island of Sicily. The interpolation algorithms were optimized by using a subset of the available dataset, while the remaining subset was used to validate the results in terms of the accuracy and bias of the estimates. Validation results indicate that univariate methods, which neglect the information from physiographic variables, significantly entail the largest errors, while performances improve when such parameters are taken into account. The best results at the annual scale have been obtained using the the ordinary kriging of residuals from linear regression and from the artificial neural network algorithm, while, at the monthly scale, a Fourier-series

Abstract: An exhaustive comparison among different spatial interpolation algorithms was carried out in order to derive annual and monthly air temperature maps for Sicily (Italy).Deterministic, data-driven and geostatistics algorithms were used, in some cases adding the elevation information and other physiographic variables to improve the performance of interpolation techniques and the reconstruction of the air temperature field.The dataset is given by air temperature data coming from 84 stations spread around the island of Sicily.The interpolation algorithms were optimized by using a subset of the available dataset, while the remaining subset was used to validate the results in terms of the accuracy and bias of the estimates.Validation results indicate that univariate methods, which neglect the information from physiographic variables, significantly entail the largest errors, while performances improve when such parameters are taken into account.The best results at the annual scale have been obtained using the the ordinary kriging of residuals from linear regression and from the artificial neural network algorithm, while, at the monthly scale, a Fourier-series

Introduction
Climatic data are used for scientific and technical purposes in many different areas of environmental research and resource management [1], hydrology and agriculture [2].
Climate series are often limited in spatial and temporal coverage.The management and maintenance of gauge stations to obtain reliable data can be very expensive and problematic.In order to tackle this problems, it is useful to develop spatial interpolation techniques to assess climatic variables at ungauged sites.In general, spatial interpolation techniques use observations of the same variable made at other sites.The methods used to estimate precipitation and temperature at ungauged sites can be grouped into: (I) weighting methods which, in turn, can be classified into deterministic interpolation methods (inverse distance weighting, non-linear interpolation as spline techniques, etc.) and statistic interpolation methods (different varieties of kriging); and (II) data-driven methods (regression, artificial neural networks, etc.) [3].
Even though spatial interpolation techniques have been largely analyzed and compared, an optimum spatial interpolation algorithm is difficult to identify.Indeed, the suitability of a spatial interpolator is strongly related to the specific variable considered, the specific context and the spatial scale desired for mapping [4][5][6][7].Moreover, other factors may also influence the choice of the interpolation method and the accuracy of the results.Among these, station density [8], the duration and the nature of the climate variable [9] (e.g., temperature and sea level pressure are continuous in both time and space, whereas precipitation fields are spatially discontinuous on shorter timescales and more continuous on longer timescales), and geographical factors, such as elevation [10], aspect and distance to the coast [11], are probably the most important.The comparison among algorithms based on deterministic and data-driven methods or methods that integrate stochastic and regressive approaches has usually demonstrated the superiority of the latter two approaches, making them a good solution to the spatial interpolation of climatological variables [12][13][14].
Among data-driven algorithms, the regression methods can be considered as direct and explicit procedures.These methods use ancillary geographical information as independent variables to improve the estimation of the spatial distribution of the considered climate variables.Several studies present GIS-based spatial interpolation algorithms in order to derive tools for the analysis and the characterization of the spatial structure of precipitation for different applications [15].Claps et al. [16] applied regression analysis to annual and monthly average temperatures to characterize the temperature regime in Italy.Data from 738 weather stations were used to estimate the temperature normals.In particular geo-regressive methods were used, taking into account geographic and morphological parameters, and through a stepwise regression analysis, the significant variables were detected.The temperature regime was reproduced with a two-harmonic Fourier series, leading to satisfactory results for large-scale climatic characterization.
Another data-driven algorithm widely used to derive the spatial distribution of climatic data is the artificial neural network (ANN) [17,18].Among the applications of this approach, Rigol et al. [19] used a feed-forward back-propagation network method for the interpolation of geographical data that incorporated both trend and spatial association and applied the method to the daily minimum temperatures in England and Wales.Coulibaly and Evora [20] performed a comparison of six different types of ANN techniques for assessing precipitation and temperature at ungauged sites.The evaluation of the accuracy of the different models, carried out using daily series from 15 validation weather stations, highlighted that multi-layer perceptron (MLP) was the most effective method.
Several geostatistical methods have been used for assessing the performances of the spatial processing of precipitation fields in several works focused on this issue, e.g., [21][22][23].As highlighted by Haining et al. [24], geostatistical methods can be used to perform the characterization of the spatial structure of the process considered and provide useful insights into the uncertainty associated with the interpolation analysis.Hudson and Wackernagel [25] analyzed the spatial structure of mean temperature using variograms computed for different directions.Since this anisotropic variogram for universal kriging does not fully explain the spatial variation in temperature, the authors exploited the correlation with elevation, adopting the kriging with elevation as external drift, obtaining better results.
Geostatistics were also used together with ANNs.In particular, Demyanov et al. [26] proposed a two-step spatial interpolation method named direct neural network residual kriging: the first step is a data-driven approach, which includes estimating the large-scale spatial structure by an ANN, while the second step is the analysis of residuals carried out by using a geostatistical method; final estimates are produced as a sum of ANN estimates and ordinary kriging estimates of residuals.A similar approach was followed by Antonic and Krizan [27], who built empirical models based on neural networks for seven climatic variables using 127 weather stations; the residuals coming from the ANN were spatially interpolated by kriging and used as a model correction.
In order to obtain yearly averaged temperature maps providing complete and continuous information in the territory, several algorithms of spatial interpolation are presented and applied in this study to annual and monthly average temperature data of Sicily (Italy) measured at 84 sites.Considering the relevant role of elevation in estimating temperature, e.g., [28,29], a set of specific data-driven and regressive approaches have been chosen.In these methods, the elevation information, as well as other physiographic information, provided by a digital elevation model (DEM), were added to improve the estimation of missing data.Moreover, simple univariate methods have been considered in order to compare their performances with those of more complex methodologies, evaluating the improvements.These different models were first calibrated using a modeling subset and then compared through a validation procedure.Finally, the best method was used to derive maps of annual and monthly average temperature, aiming at the preparation of a climate atlas for Sicily.

Methods and Study Area
The selection of methods has been performed in order to exploit the range of interpolators features and potentialities.Following the considerations reported above, methods have been classified among univariate methods and variable aided methods.Among univariate methods, the inverse distance weighting (IDW) has been selected as the most simple spatial interpolation procedure.The radial basis functions (RBF) provide a general mathematical tool that is able to identify a continuous surface representing the variable behavior.Ordinary kriging (OK) has been selected as the method for the exploitation of the spatial variability, as can be observed by means of specific tools, such as the semivariogram.The evaluation of variable-aided interpolators has been first explored with simple linear regression models.Later, more complex methods have been tested.In particular, the geographically-weighted regression has been applied in order to consider the spatial variability of the regressive relationship, and the ANN has been chosen because it is able to model the non-linear relationship between variables.Finally, residual kriging has been explored across different base variable-aided methods in order to evaluate the effects of further modeling the residual variability by means of other methods.Analyses have been performed using ESRI ArcGIS and MATLAB software.

Climatic and Geographic Features of the Area
This study concerned the largest island in the Mediterranean Sea, Sicily (Italy), extending over an area of 25,700 km 2 .The ancillary information regarding elevation was embedded through the use of a DEM of the region having a resolution of 100 m (Figure 1).The surface of the island is mainly characterized by a hilly morphology (62%) and, for the rest, by a mountainous and flat morphology among which the widest is around Catania, in the eastern part of the island.The overall island average elevation is around 400 m, but the range of variation of elevation is between 0 and 3263 m (the Etna volcano, the highest peak of Sicily).
Sicily is a region with a temperate-mesothermal (Mediterranean) climate, with a dry summer, having an average temperature in the hottest month greater than 22 • C, with a precipitation regime more intense in the coldest season.Some physiographic parameters were also considered in this study in order to evaluate their relationship with temperature measurements.Following the criteria of Prudhomme and Reed [30], such physiographic and geographic factors, chosen taking into account the temperature pattern over the area, are the minimum absolute distance from the sea, d min , the angle α formed by the minimum distance vector and the south, the distance d i from the sea in the eight cardinal directions i and the azimuthal angle β i of the horizon in the eight cardinal directions.
The latter variable is the obstruction factor [31], defined as the angle subtended by the highest topographical barrier in the i-th direction, such that tan(β i ) = ∆H/∆X, with ∆H as the difference of elevation between the barrier and the station, and ∆X as the distance between the two points.These variables have been derived using spatial analysis techniques in a GIS environment, using a vector representation of the Sicilian coastal line and the DEM.These variables were used to produce the three following parameters: -the distance measure, D s , that is the geometric mean of the distance from the sea in the eight cardinal directions: -the aspect variable, A s , a combined measure of aspect orientation and sea proximity: -the concavity index, C, obtained by weighting the azimuthal angle iin the eight directions:

Climatic Dataset
The temperature dataset used in this study was provided by OA-ARRA (Osservatorio delle Acque-Agenzia Regionale dei Rifiuti e delle Acque) and comes from 84 weather stations with observations recorded from 1924 to 2006.The entire station dataset was divided into two subsets: (I) the modeling subset, used to calibrate the different interpolation methods, containing about 70% of the stations (N t = 59); and (II) the validation subset, used to validate the calibration results, containing the remaining 30% of the stations (N v = 25).The partition of the dataset into two subsets was done by trying to maintain the distribution of sites, both in the horizontal domain and in the altitudinal distribution.
In order to provide the estimate ẑ of the temperature variable z, both annual or monthly, at an ungauged location x 0 using temperature data at instrumented sites, denoted with {z (x i ) , i = 1, 2, ....N }, the temperature vector measured at the N sites x i , two different classes of interpolators were used: univariate methods and variable-aided interpolation (VAI) methods.While the former class takes into account only the data and the spatial coordinates x i , the latter also uses supplementary data, such as elevation q (x i ), and other physiographic variables.These methods will be shortly described in the following subsections.

Univariate Interpolators
The radial basis function [32] represents a large family of exact interpolators similar to those used in geostatistical interpolation (see further), but without the benefit of a prior analysis of spatial correlation.These interpolators, which do not make any assumption on the input data and provide excellent interpolators for a wide range of data, have been used in many disciplines.The simplest variant of this method can be viewed as a weighted linear function of (inverse) distance from grid point to data point, plus a "bias" factor.The temperature at an ungauged site x 0 can be estimated by: where φ (ρ) is the selected radial basis function, ρ is the radial distance from point x 0 to the i-th data point x i (ρ = ||x i − x 0 ||) and λ i are the weights to be estimated together with the bias value µ (or Lagrangian multiplier).These weights are given by the solution of a system of (n + 1) linear equations.
The radial basis function used in this work is the thin plate spline, which has the following expression: This interpolation method requires the choice of a minimum number of neighbor points.
The second univariate method taken into account here was the inverse distance weighting (IDW).The temperature is a linear combination of several surrounding observations, while the weights are proportional to the distance between observations and x 0 : where the weights λ i are expressed as a function of distance as follows: This exact interpolation method requires the choice of the exponent r and of a search radius R or the minimum number of points N.
The third univariate method taken into account, is OK, which belongs to geostatistical interpolators [33,34].This method uses a particular function, called the semivariogram, instead of simple Euclidean distance functions in order to measure the dissimilarity between observations.The experimental semivariogram is a function of both distance and direction.If anisotropic spatial patterns exist, different semivariograms should be computed along different directions.The semivariogram is used to derive the weights of an unbiased linear interpolator by means of the minimization of the error estimate variance.Thus, the OK can be labeled as a BLUE method (best linear unbiased estimator).The relationship between such weights and the semivariogram results: where γ is the semivariogram vector relative to each sample point and that to which the estimation is referred, Γ is the semivariogram matrix relative to sample points and λ is the weight vector relative to the point to be estimated.A detailed characterization of this topic can be found, for instance, in Kitanidis [34], Goovaerts [35].

Linear Models
The most simple method belonging to VAI methods is linear regression (LR).It consists of temperature estimation as a function of the elevation of the instrumented sites q (x 0 ), in the case of simple linear regression, or as a function of more than one physiographic variable, in the case of multiple linear regression (MLR).The equation used for simple and multiple linear regressions is: where v j (x 0 ) are the n v independent variables for the MLR (n v = 1 simple regression), while the regression coefficients α, β (or β j ), constant over the whole study area, can be estimated, using the least squares method, from the set of temperature and the independent variables data at N ungauged sites {z In order to assess regression coefficients, the standard OLS (ordinary least squares) methodology has been compared with the results provided by the robust regression method (ROB) [36].Physiographic and geographic parameters used in MLR were the elevation q, the latitude L, the minimum absolute distance from the sea, d min , the angle α formed by the minimum distance vector and the south, the distance d i from the sea in the eight cardinal directions i, the azimuthal angle β i of the horizon in the eight cardinal directions and three indices D s , A s , C, as previously defined in Equations ( 1)-(3), respectively.

Geographically Weighted Regression
Geographically Weighted Regression (GWR) [37] considers the variability of coefficients α and β of a regression expression linking temperature to the topographic features allowing these two parameters to change over space using the following relationship: Each coefficient may be thought of as a 3D surface over the geographical study area rather than a single constant value.The estimation of the two coefficients α (x 0 ) and β (x 0 ) can be carried out by one of the two following approaches: GWR with fixed spatial kernels and GWR with adaptive spatial kernels.The latter approach, used in this study, is based on a circle of radius R drawn around a point to calibrate an ordinary least squares regression model on the basis of the observations that fall within this circle; the coefficients α and β are an estimation of the local coefficients at the point x 0 .The spatial kernel can be adapted depending on the density of the data [38].

Artificial Neural Networks
ANNs have been extensively applied in the past decade for estimating and forecasting hydrological variables [39,40].In this work, the MLP method has been applied.An MLP is a feedforward neural network consisting of a number of units (perceptrons or neurons), which are connected by weighted links.The units are organized in layers: an input layer, with a number of neurons defined by the input independent variables, an output layer, with a number of neurons defined by the dependent variable, and, finally, one or more hidden layers.In an MLP, an arbitrary input vector is propagated forward through the network, where the hidden layers of neurons make a linear combination of input signals and convert it through a generally nonlinear function (activation function).Each layer output forms the input to the following layers.
MLP was used like a VAI technique to estimate temperature values at ungauged sites using the elevation information, as well.The implemented MPL consists of three layers: the input layer with three neurons, the output layer with one neuron and one hidden layer.
ANN techniques require a training stage, to learn the dynamics of the causal relationship.In the training phase of an MLP, known input-output pairs are presented to the network, and the weights are updated by following pre-determined learning rules.In our work, the "error backpropagation rule" (EBP) was used.

Use of Residual Spatial Interpolation (Residual Kriging)
The above-mentioned VAI methods (LR, MLR, GWR and ANN) can provide, for each one of the N gauged sites, a residual error defined as the difference between the estimated value ẑA (x i ) and the observed value z(x i ): The spatial correlation of residuals r A (x i ) can be taken into account using a geostatistical algorithm denoted as residual kriging (RK).In this case, the four methods estimate large-scale structures of temperature data, valid for the whole domain, and then, the interpolation of residuals r A (x i ), carried out with the OK method, produces a map of r A (x i ) representing the corrections to apply to each one of the three models.Final estimates at an ungauged site x 0 are produced as a sum of the VAI method estimates and ordinary kriging estimates of residuals: This approach has been extensively used for mapping climatic variables [21,30,41].

Fitting Monthly Temperature by Fourier Series
Another method, based on the seasonal scaling of annual temperature, was implemented in the case of monthly analysis.This method, described in Claps et al. [16], allows one to derive the monthly temperature by means of the estimation of the whole parameterized curve of the temperature regime at each point over the domain.With such an approach, it is possible to describe with a single model the causal relationship between physiographic and geographic predictors and the seasonality of temperature in Mediterranean climates.Starting from estimates of the spatial variability of the mean annual temperatures, the monthly deviations from the annual mean can be analyzed, considering the zero-mean temperature regime t (j) given by this expression: where j denotes the month of the year, T(j) is the average temperature at month j and T a is the mean annual temperature.The sequence of 12 monthly temperature deviations t (j) can be well approximated by means of sinusoidal curves obtained by Fourier series using the following expression: where A 0 is the mean of t (j) (here, equal to zero), τ, here equal to 12, is the fundamental period of the cycle, N is the number of the harmonics, A i is the amplitude, φ i is the phase and τ i is the period of the i-th harmonic.For the estimation of the Fourier series parameters, the cosine argument in Equation ( 14) can be decomposed to a polynomial form: where The amplitude and phase of the i-th harmonic can then be obtained as:

Evaluation of Different Algorithms
For a general comparison among methods, it was not possible to use the geostatistical cross-validation [33] because of the presence of global interpolation methods, such as LR and GWR.Therefore, the comparison criteria were based on the following different indices used to measure the strength of the statistical relationship between estimated ẑ (x i ) and measured z (x i ) temperatures in the N v points of the validation subset.
(1) The RMSE (root mean square error of the prediction), which measures the average square difference between the true temperature and its estimate at the validation points: (2) The MBE (mean bias error) or simply bias: (3) The MAE (mean absolute error): (4) The CC (linear correlation coefficient): where z m and ẑm are the mean of measured and estimated temperature values, respectively, and σ e and σ o are the standard deviation of measured and estimated temperature values, respectively.Another tool used to compare the different methods is the Taylor diagram [42].This tool is based on a trigonometric similitude between the standard deviation of series (SD), CC and centered pattern root mean square difference (RMSD), and it allows the measuring of estimate performances, in terms of RMSD, given SD and CC values calculated between estimates and measurements for the validation subset.

Univariate Methods
The application of the first univariate method, RBF with a thin plate spline, has been performed by varying the minimum number of points N. Table 1 shows the results of the different trials carried out on the validation set to get the best parameter choice for all univariate methods.The best result, in terms of RMSE and CC, is obtained for N = 30, even if the relative MBE value is slightly worse than that obtained for N = 10.
The estimate made by the second method, IDW, depends on the selection of the exponent r and the neighborhood search strategy.As can be seen in Table 1, several attempts were done to get the optimal combination of these settings.These trials were carried out by fixing the exponent r and changing the minimum number of points N and the search radius R. The best performance was achieved for r = 2 and R = 40 km, both in terms of RMSE and bias.Instead, the best result, in terms of CC, is obtained for r = 2 and N = 5.The bias of RBF and IDW is negative, pointing out that these univariate methods underestimate the annual temperature at the validation sites.For the OK method, the possible presence of anisotropic patterns was explored, despite the limited number of available stations.Indeed, the experimental semivariogram is a function of both distance and direction, so it can account for anisotropic patterns revealed by the data.In particular, it was observed that the sill varies along different directions, determining the phenomenon known as zonal anisotropy [35].The zonal direction, i.e., the direction of maximum variability, was found at 20 degrees measured clockwise from the north (Figure 2).
The theoretical zonal anisotropic semivariogram was operationally modeled by summing an isotropic semivariogram, given by the first term, to a geometrical anisotropic semivariogram with a small anisotropy ratio (according to the procedure suggested for the GSTAT software; see Pebesma and Wesseling [43]).With regard to the average annual temperature data, the theoretical semivariogram, calibrated with a manual procedure, assumes the following form: γ = 1.3 Sph(25000) + 1.7 Gau(2.5 × 10 4 , 110, 0.0001) (22) In particular, 1.25 • C 2 is the partial sill of the theoretical semivariogram at the principal direction (called c 1 , in GSTAT nomenclature); 1.75 • C 2 is the value that was summed to c 1 to obtain partial sill in the zonal direction (called c 2 ); 25, 000 m is the maximum range of the variogram model (called a 1 ), 1.25 × 10 4 m is the maximum range amplified (called a 2 ); 110 degrees is the angle for the principal direction (called p); 0.0001 is the range ratio (i.e., the ratio between the minor range and the maximum range, called s).It has been decided not to consider the nugget effect, because information about possible sensor errors was not available.Furthermore, data have been preprocessed and possible data anomalies checked.In order to model possible high variability in the short range, the spherical model (Sph), almost linear at its origin, has been selected.The additive model used for the anisotropic formulation is the Gaussian model (Gau), varying very smoothly at the origin and allowing for a gradual development with respect to the principal direction semivariogram.Performance indices (RMSE, MBE and CC) of OK are comparable to those obtained with the best IDW trial, but worse than those obtained with RBF, which can be definitely considered the best method, among the univariate, in terms of accuracy, bias and correlation for annual temperature field.

Variable Aided Methods
The results of the application of VAI algorithms for the spatial interpolation are reported in Table 2.The ordinary least squares (OLS) and the robust (ROB) regression methods were used for regression coefficient estimation.The first of these provides the best results in terms of both MBE and RMSE, but the differences in terms of accuracy and unbiasedness between the two methods are negligible, suggesting the absence of outliers in the temperature dataset.The second VAI method (GWR) was applied adopting a searching strategy based on the number of closest points to be considered in the estimate.The best performance, in terms of accuracy and correlation, was obtained for N greater than 20, while, in terms of bias, for N equal to 15.In general, the best results of GWR are comparable to those of LR.
The third VAI method is the multiple linear regression (MLR), which estimates temperature by exploiting its dependence on geographic and physiographic features.Once the indices listed in Section 2.4 were derived, a stepwise regression procedure was applied to the training set to build the model that links the mean annual temperature to the morphological parameters.This involves the generation of multiple regression models of increasing complexity, according to the number of independent variables considered.These models have been compared and evaluated using the coefficient of determination R 2 , the RMSE and the test on coefficients significance of T stat, given by the ratio between R 2 and the standard error.The best results in terms of RMSE and R 2 were obtained by the use of q (elevation), L and D s as independent variables, but the value of T stat for L suggests neglecting the latter variable.Finally, the best model for mean annual temperature is only based on elevation q (m a.s.l.) and the geometric mean of the distance from the sea in the eight cardinal directions D s (m), as follows: The performance indices provided by this method are satisfactory, both in terms of RMSE (lower than for GWR and LR) and for MBE.Moreover, the CC value, equal to 0.96, confirms an excellent agreement between estimates and observations.

Artificial Neural Networks
The results of the application of ANNs are reported together with those of VAI methods in Table 2.The ANN was trained by the scaled conjugate gradient algorithm, and the initial weights were randomly extracted from a standardized normal distribution.The learning rate decreased exponentially from 0.01 to 0.0001.Six network architectures, having different numbers of neurons in the hidden layer, were compared, and on the basis of RMSE and MBE performances, the best resulted in being a network with one hidden layer, having four neurons with a hyperbolic tangent activation function (MLP 3-4-1).In terms of RMSE, the ANN performances overcame those of LR, MLR and GWR, while the MBE value was slightly worse than that provided by MLP 3-8-1.Moreover, the correlation coefficient between estimated and measured data for the validation set was equal to 0.97, confirming an excellent agreement within the validation dataset.This confirms the ANN as a suitable algorithm to estimate nonlinear processes, such as temperature, as well as other hydroclimatic variables.

Residual Kriging
The correlation coefficient between measured data and residuals provided by LR, equal to −0.44, highlights that not all of the spatial data correlations can be explained by LR methods, and a similar observation can be done for the MLR and MLP with four neurons in the hidden layer, whose correlation coefficient between measured data and residuals is equal to −0.4 and −0.35, respectively.This highlights that the residuals obtained by VAI methods still show a spatial correlation that can be exploited to improve the variable estimation.Starting from these considerations, residual kriging (RK) was used as a further VAI method by applying OK to the residuals derived from LR, MLR, GWR and ANN (MLP 3-4-1).In this case, an isotropic behavior in the residuals coming from the four different methods was observed when deriving the experimental semivariograms.As previously explained, the estimated values of residuals were added to the LR, MLR, GWR and ANN, respectively.In terms of accuracy and correlation, the performances of these methods are better than the univariate method and the other VAI methods (Table 2), while the bias is comparable to that of the other VAI methods.The best method, in terms of RMSE, MBE and CC, appears to be the RK-LR (with OLS).

Monthly Analysis
The interpolation methods with the best performance, described for the case of average annual temperature data, have been also applied to mean monthly temperature.Particular attention was paid to the estimation of the semivariograms of both the average monthly data (i.e., OK) and the residuals coming from the use of four VAI and ANN methods (i.e., LR, MLR, GWR and MLP 3-4-1).In this regard, the presence of a zonal anisotropy was observed in mean monthly data for all months with the exception of the summer months (JJA).Moreover, for the mean monthly temperature, another method has been taken into account: the Fourier series with the estimation of parameters by an MLR method.In particular, the first attempt to reconstruct the zero-mean temperature regime was made with a one-harmonic (N = 1) Fourier series (F1H), with τ 1 = 12 months.The parameters A 1 and φ 1 of the Fourier equation were then correlated with the geographical and physiographic descriptors (q (m a.s.l.), L (hexadecimal degrees), D s (m), A s (m −1 ), C), defined before, using the stepwise procedure to check the presence of any statistically significant relationship.Results of this multi-regressive analysis point out that the most efficient models found for amplitude A 1 and phase φ 1 are: Despite fair model results, significant errors occur in correspondence with the pairs of consecutive months with either the lowest or highest temperature (December and January, and July and August, respectively).In order to improve the temperature cycle reproduction and to avoid this problem, a second Fourier wave (F2H) with τ 2 = 6 months was used.In this case, least-squares regressions of the four parameters for all stations produced estimates of A 1 and φ 1 almost identical to those obtained considering only one harmonic.The analysis of the second harmonic parameters showed that A 2 and φ 2 can be considered constant in space, with an average value of 0.12 and 2.03, respectively.
Finally, monthly mean temperatures can be reconstructed using the following combined model: where T a is the mean annual temperature, obtained with one of the above-mentioned methods.In this application, T a is the estimated mean annual temperature obtained by RK-ANN (MLP 3-4-1).
In Table 3, an overview of the results is shown for monthly mean temperature.The best method is given for each month together with the corresponding value of the relative statistical indices.For RMSE values, the best results were always obtained with a VAI method.In particular, in seven of the 12 months, the lowest value of RMSE was achieved using the two-step VAI method, i.e., the Fourier series with the estimation of parameters by an MLR approach (F2H).For the other five months, the lowest value of RMSE was given by two-step VAI methods (RK-OLS, RK-MLP, RK-GWR).Differently from what was obtained at the annual scale, the best results in terms of unbiasedness were achieved with the VAI methods (direct application or two-step).Therefore, on the basis of these considerations, the F2H was chosen as the interpolator to be used to create the mean monthly air temperature maps.

Discussion and Application of the Best Interpolation Method
The performances of the best methods at the annual scale, respectively in terms of RMSE and MBE, for both univariate and VAI methods are shown in Figure 3. RBF with N = 30 can be considered the best univariate method.IDW skill has the worst performances, yielding higher RMSE and MBE than other univariate methods, while OK has intermediate skills between IDW and RBF.However, all univariate methods are very biased, providing a systematic underestimation of temperature.In order to improve the performance, it is necessary to switch from univariate methods to the VAI ones, by introducing the elevation information, in the case of LR, GWR and ANN methods, or by introducing elevation and physiographic information, in the case of ML.Indeed, from a comparison of the performance indices in Tables 1 and 2, it can be observed that with the VAI methods, RMSE and MBE subside dramatically.In particular, while the LR is characterized by the lowest bias, but relatively high RMSE, methods, such as GWR, ANN and MLR, provide good results, with acceptable values of RMSE and with MBEs substantially higher than those obtained with LR.The best results, in terms of RMSE, are achieved by RK and, in particular, RK-LR, which gives the lowest RMSE and the highest CC.In general, the application of OK to the residuals resulting from any VAI method remarkably decreases the RMSE of the underlying VAI method, as can be observed also in Figure 3a.However, taking into account the values of the two indices, RMSE and MBE, at the same time, one can note that a simple method such as LR (ROB), characterized by an RMSE value 36% higher than for RK -LR, has an improvement, in terms of MBE, of about 90% if compared to the MBE value of RK-LR.The values of CC are comparable.On the basis of the previous considerations, both LR and RK-LR (with OLS) could be chosen as the best method and used for spatial interpolation of the mean annual temperature.The mean annual temperature map obtained by RK-LR is shown in Figure 4. Regarding the analysis at the monthly scale, the spatial variation of the cyclical temperature pattern was found to be linearly dependent on the distance from the sea and latitude, in addition to elevation.The parameters of the two-harmonic Fourier series reproducing the temperature regime have been related to the above-mentioned physiographic predictors through a linear multivariate model, without the need to build 12 different models, one for each month.The results of the regression analysis clearly show that the amplitude and phase of the temperature cycle are affected by elevation and distance from the sea and also by latitude (only amplitude).Different spatial dependencies had been found by Claps et al. [16], who applied the procedure over the entire Italian territory, pointing out another advantage of this method: it allows one to easily assess the influence of the various parameters at different spatial scales.
Retrieved standard deviation (SD) and correlation coefficient (CC) values were used as polar coordinates on the Taylor diagram (Figure 5), which displays the relationships of these values with the centered pattern error component, therefore not considering the bias component.OK and IDW have very similar performances, while RBF shows better performances in terms of centered root-mean-square difference (RMSD).VAI methods are clustered, with definitely better skill than the univariate methods thanks to the higher CC and SD values, at least in two of the three univariate methods, closer to those of observations.Standard deviation values for the two groups seem related to their mean CC; moreover, it seems that a minimum RMSD level related to the corresponding mean CC is reached by both groups.This effect could be due to the calibration operations performed for each interpolation method by means of parameter selection.

Conclusions
The availability of climatic maps, a key issue for agricultural, hydrological and environmental management, has increased in the last few decades thanks to the spread of new and efficient spatial interpolation techniques and GIS tools.Due to the relative abundance of methods, many algorithms are currently applied, and investigations continue, aiming at the definition of the "best" method for each single climatic variable.This study calibrated and compared nine different methods for spatial interpolation of temperature at the annual and monthly scale over Sicily, a Mediterranean island and a region with complex topography.
Different families of methods were considered: univariate, multivariate, artificial neural networks, a model to describe the cyclical nature of monthly temperature and a two-step application of methods.This comparison highlighted that among the univariate methods, the best performance, in terms of accuracy and unbiasedness, was obtained with RBF, performing better than the geostatistical method OK.Indeed, although RBF is a simple method that does not take into account any external variable, it yielded quite low RMSE (1.28 • C) and bias (−0.21 • C).
Several methods were implemented that take into account elevation and other geographic and physiographic parameters, from the simpler deterministic methods (i.e., linear simple or multiple regression and geographically-weighted regression) to more sophisticated ones.In particular, a reduction of 74.4% of the RMSE is possible using one of the best variable-aided interpolation methods instead of the best univariate one.The introduction of geographic and physiographic variables improves the interpolation performances, highlighting the importance of this information for modeling temperature in a complex morphology region, such as Sicily.On the whole, the best results, for mean annual temperature were achieved by both the application of a geostatistical method to the result of a deterministic one (the RK based on LR) and by the use of a simple method, such as LR.
At the monthly scale, the best results in terms of accuracy and unbiasedness have been obtained by the F2H application, which also takes into account the contribution of a second six-month harmonic.These results open the way for the implementation of a very efficient two-step interpolation technique for monthly temperatures resulting from the interpolation at the annual scale, carried out with linear regression and ordinary kriging of residuals, and the eventual monthly downscaling of annual results by the Fourier series method.
Finally, the results lead to the elaboration of reliable temperature maps, which can significantly contribute to the proper application of hydrological modeling, e.g., for water management, for the monitoring and management of the hydrogeological risk and for the climatological characterization of the territory.

Figure 1 .
Figure 1.One hundred-meter DEM of Sicily; elevation histograms, based on DEM, and average mean temperature histogram, based on station records, are reported next to the legend and on the bottom-right, respectively.

Figure 2 .
Figure 2. Empirical and theoretical semivariograms of mean annual temperature in the zonal direction (20 • ) and in the principal direction (110 • ).

Figure 3 .
Figure 3. (a) RMSE of the best univariate and VAI interpolation methods; (b) mean bias error (MBE) of the best univariate and VAI interpolation methods.

Figure 4 .
Figure 4. Mean annual temperature interpolated using residual kriging with linear regression (RK-LR) z-q method.

Figure 5 .
Figure 5.Taylor diagram for mean annual temperature interpolation models.The label OBS is used to represent the reference data (i.e., observations) to which estimator performances are referred.

Table 1 .
Comparison of goodness of interpolation for the univariate methods applied to mean annual temperature.

Table 2 .
Comparison of indices of interpolation goodness of the variable-aided (VAI) interpolation methods applied to mean annual temperature.

Table 3 .
Results of monthly analysis; each box reports the interpolation model leading to the best result for each index, and the corresponding value of the index.