# Improvement of Spatial Autocorrelation, Kernel Estimation, and Modeling Methods by Spatial Standardization on Distance


## Abstract


## 1. Introduction

- By taking into account neighborhood relations. We can use the direct neighborhood (in the Voronoi sense) between the two points of the pair, or between their centroids in the case of polygons, by assigning 1 if the two points are neighbors and 0 if they are not. When focusing on adjacency relationships, the length of the common edge between objects can be used instead: the shared edge of the Voronoi tessellation in the case of a point pattern, or the length of the common boundary in the case of adjacent polygons.
- By taking into account the distance between objects (represented by points or centroids ${P}_{i}$). A distance function $d$ is used (Euclidean distance, Manhattan distance, distance along a weighted network, etc.), and the weight of a pair is a function of this distance. The distance is often limited to a maximum distance ($dmax$), called the bandwidth, beyond which the weight is 0, meaning that there is no spatial dependence beyond this distance. The weight function can be polynomial, for example, $\max\left(0,\ 1-\frac{d{\left({P}_{i},{P}_{j}\right)}^{k}}{dmax^{k}}\right)$ with $k=\frac{1}{2},1,2,\dots$; Gaussian, for example, $\exp\left(-d{\left({P}_{i},{P}_{j}\right)}^{2}/dmax^{2}\right)$; sigmoid, etc. The maximum distance ($dmax$) can be set for all pairs, or it can depend on a density-related parameter. For example, $dmax$ can depend on the distance from one of the points of the pair to its $n$-th closest point. It can also be estimated from the range of the semivariogram corresponding to the situation to be analyzed.
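The distance-based weights described above can be sketched as follows. This is a minimal illustration with NumPy, not the authors' implementation: the function names (`pairwise_distances`, `polynomial_weights`, `gaussian_weights`, `weight_matrix`) are hypothetical, Euclidean distance is assumed, and truncating the Gaussian weight at $dmax$ is an assumption based on the statement that the weight is 0 beyond the bandwidth.

```python
import numpy as np


def pairwise_distances(points):
    """Euclidean distance between every pair of rows of an (n, 2) array."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def polynomial_weights(d, dmax, k=1.0):
    """max(0, 1 - d^k / dmax^k): equals 1 at d = 0 and 0 from d = dmax on."""
    return np.maximum(0.0, 1.0 - (d / dmax) ** k)


def gaussian_weights(d, dmax):
    """exp(-d^2 / dmax^2), set to 0 beyond the bandwidth (assumed truncation)."""
    w = np.exp(-(d ** 2) / dmax ** 2)
    return np.where(d <= dmax, w, 0.0)


def weight_matrix(points, dmax, kernel=polynomial_weights, **kernel_kwargs):
    """Spatial weight matrix W with W[i, j] = kernel(d(P_i, P_j)) and W[i, i] = 0."""
    d = pairwise_distances(points)
    w = kernel(d, dmax, **kernel_kwargs)
    np.fill_diagonal(w, 0.0)  # a point is not its own neighbor
    return w


if __name__ == "__main__":
    pts = np.array([[0.0, 0.0], [3.0, 4.0], [10.0, 0.0]])
    print(weight_matrix(pts, dmax=10.0, kernel=polynomial_weights, k=1.0))
    print(weight_matrix(pts, dmax=10.0, kernel=gaussian_weights))
```

A single global `dmax` is used here; a density-dependent bandwidth would replace the scalar with a per-point value.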

- Spatial kernel estimation (kernel estimation and kernel density estimation) extends the principles of classic one-dimensional kernel estimation to two dimensions. When the variable is numerical, spatial kernel interpolation calculates, at each point of a grid, the average of the values of all objects located closer than a given bandwidth distance $dmax$, weighted by a function (referred to as the kernel) of their distance $d$ to the grid point [16]. Commonly used kernel functions include a linear function (e.g., $\left(dmax-d\right)/dmax$), a quadratic function (e.g., ${\left(\frac{dmax-d}{dmax}\right)}^{2}$), or a Gaussian function (e.g., $\frac{1}{\sqrt{2\pi}}{e}^{-\frac{1}{2}{\left(\frac{dmax-d}{dmax}\right)}^{2}}$). When the variable is qualitative, kernel density estimation consists of calculating, for each point of a grid, the weighted number of objects located closer than a given distance $dmax$, each object being weighted by the kernel.
- Autoregressive spatial models (Autoregressive Regression, Simultaneous Autoregressive Regression, Conditional Autoregressive Regression, Generalized Additive Model, Structured Additive Regression) also use a spatial weight matrix constructed as for spatial autocorrelation indices [2,3,16,17,18]. For example, for autoregressive regressions, we have:

  $${z}_{j}=\sum_{k}{x}_{jk}{\beta}_{k}+\rho \sum_{i,\,i\ne j}{w}_{ij}{z}_{i}+{\epsilon}_{j}\quad \left(Z=X\beta +\rho WZ+\epsilon \right)$$

  The weights can have a relative influence on ${z}_{j}$: for all individuals ${P}_{j}$, the sum $\sum_{i}{w}_{ij}$ of the weights of all its neighbors must then be equal to 1, and $\sum_{i}{w}_{ij}{z}_{i}$ is just a weighted mean. In this case, the matrix $W$ is said to be standardized on the rows. If all weights are equal, this means adding the mean of the neighboring values to the model. The weights can also have an absolute influence. In this case, the more close neighbors ${P}_{j}$ has, the higher the value of $\sum_{i}{w}_{ij}{z}_{i}$: a weighted sum of the neighbors' values is added to the model, not a weighted average.
- Geographically weighted regression (GWR) models also use a spatial weight matrix. Here, the model's coefficients $\beta$ are allowed to vary with location, so that the model adapts to local spatial variations; these models aim at estimating regression parameters locally.
- Standardization on the rows of the weight matrix $W$ (each weight being divided by the sum of the weights of its row) can also be used in the calculation of the Moran or Geary indices, which is equivalent to taking the arithmetic mean of the local indices as the overall index.
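The weighted mean versus weighted sum distinction, the link between the global and local Moran indices under row standardization, and kernel interpolation at grid points can be sketched as follows. This is a minimal illustration with NumPy under stated assumptions, not the authors' implementation: the function names are hypothetical, `W[j, i]` is taken to hold the weight of neighbor $i$ on individual $j$, the Moran formulas are the standard textbook definitions (they are not given in this text), and the interpolation uses the linear kernel $\left(dmax-d\right)/dmax$ quoted above.

```python
import numpy as np


def row_standardize(W):
    """Divide each row by its sum; rows without neighbors are left at 0."""
    row_sums = W.sum(axis=1, keepdims=True)
    return W / np.where(row_sums > 0, row_sums, 1.0)


def spatial_lag(W, z):
    """sum_i w_ji z_i: a weighted mean if W is row-standardized, else a weighted sum."""
    return W @ z


def moran_global(W, z):
    """Standard global Moran index: (n / S0) * dev' W dev / dev' dev."""
    dev = z - z.mean()
    return len(z) / W.sum() * (dev @ W @ dev) / (dev @ dev)


def moran_local(W, z):
    """Standard local Moran indices: dev_j * sum_i w_ji dev_i / m2."""
    dev = z - z.mean()
    m2 = (dev @ dev) / len(z)
    return dev * (W @ dev) / m2


def kernel_interpolate(points, values, grid, dmax):
    """Kernel-weighted mean of `values` at each grid point (linear kernel)."""
    diff = grid[:, None, :] - points[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    k = np.maximum(0.0, (dmax - d) / dmax)  # weight is 0 beyond dmax
    total = k.sum(axis=1)
    out = np.full(len(grid), np.nan)  # NaN where no object lies within dmax
    has_data = total > 0
    out[has_data] = (k @ values)[has_data] / total[has_data]
    return out


if __name__ == "__main__":
    W = np.array([[0.0, 1.0, 1.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
    z = np.array([1.0, 2.0, 6.0])
    Wr = row_standardize(W)
    print(spatial_lag(W, z), spatial_lag(Wr, z))
    print(moran_global(Wr, z), moran_local(Wr, z).mean())
```

The equality between the global index and the mean of the local indices holds in this sketch only when every individual has at least one neighbor, so that the row-standardized weights sum to $n$.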

## 2. The Need of a Spatial Standardization

## 3. Methods

## 4. Example

#### 4.1. Spatial Autocorrelation Indices

#### 4.2. Spatial Kernel Interpolation

## 5. Discussion and Conclusions

## Author Contributions

## Funding

## Conflicts of Interest

## References

1. Tobler, W. A computer movie simulating urban growth in the Detroit region. Econ. Geogr. Suppl. **1970**, 46, 234–240.
2. Schabenberger, O.; Gotway, C. Statistical Methods for Spatial Data Analysis; Chapman & Hall: London, UK, 2005.
3. Souris, M. Epidemiology and Geography. Principles, Methods and Tools of Spatial Analysis; Wiley-ISTE: London, UK, 2019 (French edition: Epidemiologie et Géographie, Principes, Méthodes et Outils de L’analyse Spatiale; ISTE: London, UK, 2019).
4. Moran, P. The interpretation of statistical maps. J. R. Stat. Soc. Ser. B **1948**, 10, 243–251.
5. Geary, R.C. The contiguity ratio and statistical mapping. Inc. Stat. **1954**, 5, 115–145.
6. Anselin, L. Local indicators of spatial association—LISA. Geogr. Anal. **1995**, 27, 93–115.
7. Getis, A.; Ord, J.K. The analysis of spatial association by use of distance statistics. Geogr. Anal. **1992**, 24, 189–206.
8. Fotheringham, S.; Rogerson, P.A. The Sage Handbook of Spatial Analysis; Sage: London, UK; Los Angeles, CA, USA, 2009.
9. Cliff, A.D.; Ord, J.K. The Problem of Spatial Autocorrelation; Scott, A.J., Ed.; Studies in Regional Science; Pion: London, UK, 1969; pp. 25–55.
10. Cliff, A.D.; Ord, J.K. Spatial Processes: Models and Applications; Pion Limited: London, UK, 1981.
11. Upton, G.J.G.; Fingleton, B. Spatial Data Analysis by Example; Wiley: New York, NY, USA, 1985.
12. Anselin, L.; Bera, A.K. Spatial dependence in linear regression models with an introduction to spatial econometrics. In Handbook of Applied Economic Statistics; Ullah, A., Giles, D.E., Eds.; Marcel Dekker: New York, NY, USA, 1998; pp. 237–289.
13. Mantel, N. The detection of disease clustering and a generalized regression approach. Cancer Res. **1967**, 27, 209–220.
14. Getis, A.; Ord, J.K. Local spatial statistics: An overview. In Spatial Analysis: Modeling in a GIS Environment; Longley, P., Batty, M., Eds.; John Wiley & Sons: New York, NY, USA, 1996; pp. 261–277.
15. Droesbeke, J.J.; Lejeune, M.; Saporta, M. Analyse Statistique des Données Spatiales; Technip: Paris, France, 2006.
16. Bowman, A.W.; Azzalini, A. Applied Smoothing Techniques for Data Analysis; Oxford University Press: London, UK, 1997.
17. Dormann, C.; McPherson, J.; Araújo, M.; Bivand, R.; Bolliger, J.; Carl, G.; Davies, R.G.; Hirzel, A.; Jetz, W.; Kissling, W.D.; et al. Methods to account for spatial autocorrelation in the analysis of species distributional data: A review. Ecography **2007**, 30, 609–628.
18. Alagar, V.S. The distribution of the distance between random points. J. Appl. Probab. **1976**, 13, 558–566.
19. Lellouche, S.; Souris, M. Distribution of distances between elements in a compact set. Unpublished manuscript, in preparation.

**Figure 1.** Number of points in rings of increasing radius and same width, for points independently and uniformly distributed in a 2D space.

**Figure 2.** Distribution of inter-distances inside the unit circle (R = 1) for an independently and uniformly distributed set of points. In red, the curve for the theoretical probability density function; in light blue, a simulated distribution from values generated by a homogeneous Poisson model with density $\rho =1500$ inside the unit circle.

**Figure 3.** Votes for Emmanuel Macron (%) in the second round of the presidential election in France, canton level (2017) (source: data.gouv.fr and Institut Géographique National-IGN).

**Figure 5.** Moran (**left**) and Geary (**right**) autocorrelation indices with bandwidth $dmax$ varying from 25 to 250 km. In yellow, without SD-correction; in green, with SD-correction.

**Figure 6.** Spatial kernel interpolation (Gaussian function, h = 200 km) applied to the votes for Emmanuel Macron in the second round of the presidential election in France (2017): (**a**) left map, without SD-correction; (**b**) right map, with SD-correction.

| Bandwidth (km) | Number of Pairs | Moran Index (Uncorrected) | Z-Score (Uncorrected) | Standard Deviation (Uncorrected) | Moran Index (SD-Corrected) | Z-Score (SD-Corrected) | Standard Deviation (SD-Corrected) |
|---|---|---|---|---|---|---|---|
| 25 | 12,804 | 1.43 | 138.78 | 0.0104 | 1.46 | 33.46 | 0.0433 |
| 50 | 34,012 | 1.01 | 165.45 | 0.0062 | 1.14 | 45.47 | 0.0251 |
| 75 | 62,930 | 0.73 | 156.81 | 0.0047 | 0.92 | 89.67 | 0.0101 |
| 100 | 101,544 | 0.55 | 145.10 | 0.0036 | 0.76 | 128.16 | 0.0059 |
| 150 | 206,304 | 0.35 | 134.88 | 0.0025 | 0.57 | 149.73 | 0.0038 |
| 200 | 337,288 | 0.24 | 124.62 | 0.0019 | 0.45 | 150.47 | 0.0029 |
| 250 | 484,394 | 0.18 | 113.65 | 0.0016 | 0.37 | 152.98 | 0.0025 |
| 300 | 644,016 | 0.14 | 110.83 | 0.0013 | 0.32 | 153.41 | 0.0020 |

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Souris, M.; Demoraes, F.
Improvement of Spatial Autocorrelation, Kernel Estimation, and Modeling Methods by Spatial Standardization on Distance. *ISPRS Int. J. Geo-Inf.* **2019**, *8*, 199.
https://doi.org/10.3390/ijgi8040199
