Abstract
Moran’s I measures the spatial autocorrelation of univariate spatial data. Therefore, when p variables are observed, it yields p separate values, one per variable. In other words, Moran’s I cannot measure the degree of spatial autocorrelation of multivariate spatial data as a single value. This paper addresses this issue. That is, we extend Moran’s I so that it can measure the degree of spatial autocorrelation of multivariate spatial data as a single value. In addition, since the local version of Moran’s I has the same problem, we extend it as well. Then, we establish their properties, which are fundamental for applied work. Numerical illustrations of the theoretical results obtained in the paper are also provided.
MSC:
62H11; 05C50
1. Introduction
Spatial autocorrelation is a notion that describes the similarities/discrepancies between data at different vertices/spatial units. It is fundamental to spatial science, which includes spatial statistics, spatial econometrics, geographical analysis, and so on. Many measures of it have been proposed; for a historical overview of spatial autocorrelation, see, e.g., Getis []. Among them, Moran’s I is the most prominent spatial autocorrelation measure and was developed by Moran [] and Cliff and Ord [,,,]. Roughly speaking, like Pearson’s sample correlation coefficient, a positive (respectively, negative) Moran’s I indicates the presence of positive (respectively, negative) spatial autocorrelation. (However, unlike Pearson’s sample correlation coefficient, its range is not necessarily $[-1, 1]$. As will be shown later, it depends on the spatial weight matrix. See also de Jong et al. [] and Maruyama [].) Later, Anselin [] developed a local version of Moran’s I: local Moran’s I. To distinguish them, Moran’s I is sometimes referred to as global Moran’s I.
Moran’s I is designed to measure the spatial autocorrelation of univariate spatial data. Therefore, when p variables are observed, it yields p separate values, one per variable. In other words, Moran’s I cannot measure the degree of spatial autocorrelation of multivariate spatial data as a single value. In this paper: (i) We address this issue. That is, we extend Moran’s I so that it can measure the degree of spatial autocorrelation of multivariate spatial data as a single value. (ii) In addition, since the local version of Moran’s I has the same problem, we extend it as well. (iii) Subsequently, we establish their properties, which are fundamental for applied work. (iv) Numerical illustrations of the theoretical results are also provided.
Here, we discuss existing research related to this study. In addition to the papers listed above, the following papers are closely related to this paper: Wartenberg [], Anselin [], Lin [], and Yamada [,]. First, Yamada [] presented several results on univariate global Moran’s I. This paper depends on them. Second, Yamada [] dealt with the multivariate extension of Geary’s c, which was developed by Geary [] and modified by Cliff and Ord [,,,]. Thus, the present paper can be seen as a companion paper to it. It should be noted that the multivariate local Geary’s c was developed by Anselin []. Third, the relevance of Wartenberg [] as well as Lin [] to this paper is discussed in Section 6, because describing their studies requires the notation introduced below.
The paper is organized as follows. In Section 2, we sketch how Moran’s I is extended in this paper. In Section 3, we provide some preliminaries. More specifically, after stating the multivariate spatial data that will be considered in the paper, we review global and local Moran’s I for univariate spatial data. In Section 4, we define two new measures, i.e., multivariate global and local Moran’s I, and establish their properties. Section 5 provides numerical illustrations of the theoretical results obtained in Section 4. Section 6 clarifies the relationship between our multivariate global Moran’s I and Wartenberg’s [] spatial correlation matrix. Section 7 concludes the paper.
2. A Sketch of How Moran’s I is Extended
In this section, we sketch how Moran’s I is extended in this paper.
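Let $y_i$ denote a realization of a single variable y at the vertex/spatial unit $v_i$ for $i = 1, \ldots, n$. Moran’s I uses the product given by
$$z_i z_j \tag{1}$$
for $i, j = 1, \ldots, n$, where $z_i = (y_i - \bar{y})/s$, $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$, and s is the positive square root of $s^2 = \frac{1}{n}\sum_{i=1}^{n} (y_i - \bar{y})^2$. Roughly speaking, like the sample correlation coefficient, a positive (respectively, negative) global Moran’s I indicates the presence of positive (respectively, negative) spatial autocorrelation.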
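Let $\boldsymbol{y}_i$ denote a realization of a multivariate vector at the vertex/spatial unit $v_i$ for $i = 1, \ldots, n$. We ask how we can measure the similarity/discrepancy between $\boldsymbol{y}_i$ and $\boldsymbol{y}_j$, which are both p-dimensional column vectors. A natural approach is to extend (1). That is, it can be measured with the inner product given by
$$\boldsymbol{z}_i^{\top} \boldsymbol{z}_j \tag{2}$$
for $i, j = 1, \ldots, n$, where $\boldsymbol{z}_i = [z_{i,1}, \ldots, z_{i,p}]^{\top}$, and $z_{i,h} = (y_{i,h} - \bar{y}_h)/s_h$, with $y_{i,h}$ the h-th entry of $\boldsymbol{y}_i$. Here, $\bar{y}_h = \frac{1}{n}\sum_{i=1}^{n} y_{i,h}$, and $s_h$ is the positive square root of $s_h^2 = \frac{1}{n}\sum_{i=1}^{n} (y_{i,h} - \bar{y}_h)^2$ for $h = 1, \ldots, p$. We develop a spatial autocorrelation measure that uses (2).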
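The standardized inner product used as the building block can be sketched in a few lines. This is a minimal illustration, not the paper’s code: the function names `standardize_columns` and `similarity` and the small data matrix are hypothetical, rows of `Y` are assumed to be vertices and columns variables, and the standard deviation is assumed to use divisor n (the convention of `zscore(Y,1)` in the MATLAB/GNU Octave functions given in Section 4).

```python
from math import sqrt

def standardize_columns(Y):
    """Center each column of Y and scale it by its population standard
    deviation (divisor n), as zscore(Y,1) does in MATLAB/GNU Octave."""
    n, p = len(Y), len(Y[0])
    means = [sum(row[h] for row in Y) / n for h in range(p)]
    sds = [sqrt(sum((row[h] - means[h]) ** 2 for row in Y) / n) for h in range(p)]
    return [[(row[h] - means[h]) / sds[h] for h in range(p)] for row in Y]

def similarity(Z, i, j):
    """Inner product of the standardized vectors at vertices i and j (0-based)."""
    return sum(a * b for a, b in zip(Z[i], Z[j]))

# Hypothetical data: n = 4 vertices (rows), p = 2 variables (columns).
Y = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
Z = standardize_columns(Y)
print(similarity(Z, 0, 1))  # both vertices lie below the mean: positive
print(similarity(Z, 0, 3))  # opposite extremes: negative
```

A positive inner product signals that the two vertices deviate from the variable means in the same direction, and a negative one signals the opposite, which mirrors the role of the scalar product in (1).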
3. Preliminaries
In this section, after clarifying the multivariate spatial data that will be considered in the paper, we review global and local Moran’s I for univariate spatial data.
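Before we do this, we introduce some notation. Let $\boldsymbol{I}_n$ be the identity matrix of order n, and let $\boldsymbol{e}_i$ be the i-th column of $\boldsymbol{I}_n$. Let $\boldsymbol{\iota}$ be the n-dimensional vector of ones, and let $\boldsymbol{Q} = \boldsymbol{I}_n - \frac{1}{n}\boldsymbol{\iota}\boldsymbol{\iota}^{\top}$. Note that $\boldsymbol{Q}$ is a symmetric and idempotent matrix, i.e., $\boldsymbol{Q}^{\top} = \boldsymbol{Q}$ and $\boldsymbol{Q}^2 = \boldsymbol{Q}$.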
3.1. Multivariate Spatial Data
Following de Jong et al. [], we treat the problem of spatial autocorrelation in terms of a graph. Let denote a directed/undirected graph with n vertices. In addition, denote its vertex set by , where . For , let
and . We assume that for , and accordingly, is a hollow matrix by assumption. In addition, we assume that . Then is a nonzero matrix. For example, when and , its binary weight matrix (adjacency matrix) is
(Note that the corresponding graph is shown in Figure 1. The edge between and is undirected because both and belong to E.)
Figure 1.
A graph consisting of four vertices.
As illustrated above, is not necessarily symmetric. However, for any , given that , it follows that , where . Note that is symmetric even though is not symmetric. (If is an undirected graph, then is identical to . We provide such a in Section 5.2) Accordingly, given that , it follows that . Moreover, as for by assumption, .
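The symmetrization step can be written out explicitly. The symbols below are assumptions made for this sketch: $W$ stands for the (possibly asymmetric) spatial weight matrix, $\widetilde{W}$ for its symmetric part, and $x$ for an arbitrary n-dimensional vector.

```latex
\begin{align*}
x^{\top} W x
  &= \tfrac{1}{2}\left( x^{\top} W x + (x^{\top} W x)^{\top} \right)
     && \text{(a scalar equals its transpose)} \\
  &= \tfrac{1}{2}\left( x^{\top} W x + x^{\top} W^{\top} x \right) \\
  &= x^{\top} \widetilde{W} x,
  \qquad \widetilde{W} = \tfrac{1}{2}\left( W + W^{\top} \right).
\end{align*}
```

Any quadratic form in $W$ can therefore be evaluated with the symmetric matrix $\widetilde{W}$ instead, which is what makes a spectral decomposition with real eigenvalues available even for directed graphs.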
Let
Recall that in (3) denotes a realization of a multivariate vector at the vertex/spatial unit for .
Let
where
Then, by construction, is related to as
which appears in (2). Accordingly, it follows that
In addition, is related to as
3.2. Global Moran’s I for Univariate Spatial Data
Denote the global Moran’s I for a univariate spatial data, , by :
where .
Then, as shown in, e.g., de Jong et al. [], Dray [], Maruyama [], Murakami and Griffith [], and Nishi et al. [], can be expressed in matrix notation as
in (9) can also be represented by using as
Accordingly, given that , in (11) can be expressed in matrix notation as follows.
Incidentally, given that and , (12) can be obtained directly from (10).
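The equivalence between the classical form of global Moran’s I and its standardized form can be checked numerically. The sketch below is illustrative only: the function names, the path-graph weight matrix, and the two data vectors are hypothetical, and the standardization is assumed to use divisor n.

```python
from math import sqrt

def zscores(y):
    """Standardize y with the population standard deviation (divisor n)."""
    n = len(y)
    mean = sum(y) / n
    sd = sqrt(sum((v - mean) ** 2 for v in y) / n)
    return [(v - mean) / sd for v in y]

def morans_i(y, W):
    """Standardized form: sum_ij w_ij z_i z_j divided by the sum of all weights."""
    n, z = len(y), zscores(y)
    omega = sum(sum(row) for row in W)
    return sum(W[i][j] * z[i] * z[j] for i in range(n) for j in range(n)) / omega

def morans_i_raw(y, W):
    """Classical form: (n / omega) * sum_ij w_ij d_i d_j / sum_i d_i^2, d = y - mean."""
    n = len(y)
    mean = sum(y) / n
    d = [v - mean for v in y]
    omega = sum(sum(row) for row in W)
    cross = sum(W[i][j] * d[i] * d[j] for i in range(n) for j in range(n))
    return (n / omega) * cross / sum(v * v for v in d)

# Hypothetical example: binary weights of a path graph on 4 vertices.
W = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
trend = [1.0, 2.0, 3.0, 4.0]          # smooth pattern
alternating = [1.0, -1.0, 1.0, -1.0]  # neighbours always differ in sign
print(morans_i(trend, W), morans_i(alternating, W))
```

The smooth pattern gives a positive value and the alternating pattern a negative one, in line with the interpretation of positive and negative spatial autocorrelation.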
Denote a spectral decomposition of a real symmetric matrix by
where , and . Then, given that , we can let . With respect to the other eigenvalues, let be in ascending order.
We document two known results with respect to .
Proposition 1.
(Equation (19) of de Jong et al. [], Theorem 2.1 of Maruyama []). It follows that . when such that . Likewise, when such that .
Proof.
Omitted. □
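For orientation, the standard Rayleigh-quotient argument behind bounds of this kind can be sketched as follows. The notation is assumed for this sketch rather than taken from the cited proofs: $z$ is the standardized data vector (so that $Qz = z$ and $z^{\top} z = n$), $\Omega$ is the sum of all weights, $\widetilde{W}$ is the symmetric part of $W$, and $\lambda_{\min}$, $\lambda_{\max}$ denote the smallest and largest eigenvalues of $Q\widetilde{W}Q$ after setting aside the zero eigenvalue whose eigenvector is the vector of ones $\iota$.

```latex
\begin{align*}
I_h &= \frac{z^{\top} W z}{\Omega}
     = \frac{z^{\top} \widetilde{W} z}{\Omega}
     = \frac{n}{\Omega}\,
       \frac{z^{\top} Q \widetilde{W} Q z}{z^{\top} z},
  && \text{since } Qz = z \text{ and } z^{\top} z = n, \\
\frac{n}{\Omega}\,\lambda_{\min}
  &\le I_h \le
\frac{n}{\Omega}\,\lambda_{\max},
  && \text{since } z \perp \iota .
\end{align*}
```

Each bound is attained when $z$ is proportional to an eigenvector associated with the corresponding eigenvalue.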
Remark 1.
(i) Although the range of Pearson’s sample correlation coefficient is $[-1, 1]$, that of Moran’s I is the interval given in Proposition 1, which depends on the spatial weight matrix. (ii) From Poincaré’s separation theorem (Rao [] (p. 64), Scott and Styan [] (Theorem 1), Abadir and Magnus [] (p. 347), Seber [] (p. 113)), it follows that
where are the eigenvalues of in ascending order. Accordingly, it follows that
Incidentally, if is an undirected bipartite graph whose weight matrix is binary, then . See, e.g., Bapat [] (Lemma 3.13) and Estrada and Knight [] (p. 68).
Denote in (10) by . Then, from Yamada [] (Proposition 1(b)), it follows that
In addition, let , where for . Then, for and .
Proposition 2.
(Proposition 1(a) of Yamada []). in (9) is a weighted average of as follows.
Proof.
Omitted. □
3.3. Local Moran’s I for Univariate Spatial Data
Denote the local Moran’s I of a univariate spatial data by :
Then, in (18) can also be represented using as
Note that is constructed so that
We can confirm (20) as follows.
Given that and , in (19) can be represented in matrix notation as
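The additive relation between local and global Moran’s I can be illustrated with a short sketch. It assumes the scaling used in this paper, under which the local values sum to the global value (other scalings of local Moran’s I exist in the literature); the function names and the example data are hypothetical.

```python
from math import sqrt

def zscores(y):
    """Standardize y with the population standard deviation (divisor n)."""
    n = len(y)
    mean = sum(y) / n
    sd = sqrt(sum((v - mean) ** 2 for v in y) / n)
    return [(v - mean) / sd for v in y]

def local_morans_i(y, W):
    """Local values I_i = z_i * sum_j w_ij z_j / omega, one per vertex."""
    n, z = len(y), zscores(y)
    omega = sum(sum(row) for row in W)
    return [z[i] * sum(W[i][j] * z[j] for j in range(n)) / omega for i in range(n)]

# Hypothetical example: binary weights of a path graph on 4 vertices.
W = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
y = [1.0, 2.0, 3.0, 4.0]
local = local_morans_i(y, W)
print(local)       # contribution of each vertex
print(sum(local))  # equals the global Moran's I of y
```

Reading the local values side by side shows which vertices contribute most to the global statistic.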
4. Moran’s I’s for Multivariate Spatial Data
In this section, we newly introduce multivariate global and local Moran’s I and establish their properties.
4.1. Global Moran’s I for Multivariate Spatial Data
We define the following measure as the global Moran’s I for multivariate spatial data , which are p-dimensional column vectors:
which we refer to as “multivariate global Moran’s I” or simply “multivariate Moran’s I”. Note that the second equality in (23) follows from (7). When , given that reduces to for , it follows that . See (11). Thus, the multivariate Moran’s I given by (23) is a generalization of the conventional Moran’s I for univariate spatial data.
Let , which is the simple average of the univariate global Moran’s I’s, . Then, I has the following property:
Proposition 3.
I in (23) is equal to .
Proof.
Given that for , . Accordingly, it follows that
□
For any , given that , it follows that
We use (24) to derive the following results.
Proposition 4.
I in (23) can be represented compactly in matrix form as
Proof.
Remark 2.
Regarding Proposition 4, we make three remarks: (i) (25) is useful to calculate I. (ii) Given (24), it immediately follows that
which are representations corresponding to those in (12). Using (26), we can give another proof of Proposition 3 as follows.
(iii) A MATLAB/GNU Octave user-defined function for calculating I in (25) is as follows:
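    function I = calc_I(Y,W)
      % Y: n-by-p data matrix (rows are vertices); W: n-by-n spatial weight matrix
      p = size(Y,2); Z = zscore(Y,1); q = sum(sum(W));
      I = trace(Z' * W * Z)/(p * q);
    end
Note that, as the univariate global Moran’s I is the special case of I with p = 1, the function can also be used to obtain it by passing a single column of data as Y.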
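For readers without MATLAB/GNU Octave, the same calculation can be sketched in plain Python. This is an illustrative port under stated assumptions, not the author’s code: `standardize_columns` is a hypothetical stand-in for `zscore(Y,1)`, and the weight matrix and data are made up. The example also checks numerically that the multivariate value equals the simple average of the univariate values, as stated in Proposition 3.

```python
from math import sqrt

def standardize_columns(Y):
    """Column-wise z-scores with divisor n, mirroring zscore(Y,1)."""
    n, p = len(Y), len(Y[0])
    means = [sum(row[h] for row in Y) / n for h in range(p)]
    sds = [sqrt(sum((row[h] - means[h]) ** 2 for row in Y) / n) for h in range(p)]
    return [[(row[h] - means[h]) / sds[h] for h in range(p)] for row in Y]

def calc_I(Y, W):
    """Multivariate global Moran's I: trace(Z' W Z) / (p * omega)."""
    n, p = len(Y), len(Y[0])
    Z = standardize_columns(Y)
    omega = sum(sum(row) for row in W)
    trace = sum(W[i][j] * Z[i][h] * Z[j][h]
                for h in range(p) for i in range(n) for j in range(n))
    return trace / (p * omega)

# Hypothetical example: path graph on 4 vertices, p = 2 variables.
W = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
Y = [[1.0, 1.0], [2.0, -1.0], [3.0, 1.0], [4.0, -1.0]]
I = calc_I(Y, W)
I_1 = calc_I([[row[0]] for row in Y], W)  # first variable alone (p = 1)
I_2 = calc_I([[row[1]] for row in Y], W)  # second variable alone (p = 1)
print(I, (I_1 + I_2) / 2)
```

Here one variable is smooth and the other alternates in sign, so the two univariate values have opposite signs and the multivariate value sits between them.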
The following result shows the bounds of I.
Proposition 5.
It follows that . when such that . Likewise, when such that .
Proof.
The proposition immediately follows from Propositions 1 and 3. □
Remark 3.
Regarding Proposition 5, we make two remarks: (i) As in the case of global Moran’s I for univariate spatial data, the bounds of global Moran’s I for multivariate spatial data also depend only on the structure of the graph represented by . (ii) From Proposition 5 and Equation (15), it follows that
Proposition 6.
Proof.
From Propositions 2 and 3, it follows that
Here, because for . In addition, since , it follows that
□
Remark 4.
Regarding Proposition 6, we make two remarks. (i) Proposition 6 is a generalization of Yamada [] (Proposition 1(a)). (ii) Given (16), the distribution of represents the spatial autocorrelation structure of the multivariate spatial data .
4.2. Local Moran’s I for Multivariate Spatial Data
Then, as we defined I from , we define the multivariate local Moran’s I from as follows.
which we refer to as the “multivariate local Moran’s I”. When , given that for , it follows that
Thus, the multivariate local Moran’s I given by (30) is a generalization of the local Moran’s I for univariate spatial data. In addition, it follows that
Let , which is the average of the univariate local Moran’s I’s, . Then, we have the following results.
Proposition 7.
in (31) is equal to for .
Proof.
Given , it follows that
□
The following result is useful to calculate .
Proposition 8.
in (31) can be represented compactly in matrix form as
Proof.
From Proposition 7, Equation (21), and , it follows that
□
Remark 5.
Regarding Proposition 8, we make two remarks: (i) We can give another proof of based on (33) as follows. From (33), since is the -entry of , it follows that
(ii) A MATLAB/GNU Octave user-defined function for calculating in (33) is as follows:
    function Ii = calc_Ii(Y,W,i)
      % Y: n-by-p data matrix; W: n-by-n spatial weight matrix; i: vertex index
      p = size(Y,2); Z = zscore(Y,1); q = sum(sum(W));
      Ii = (W(i,:) * Z * Z(i,:)')/(p * q);
    end
Note that as is a special case of , the function can also be used for obtaining .
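A plain-Python counterpart of the function above is sketched below, again as an illustration under assumptions rather than the author’s implementation: the helper `standardize_columns` is hypothetical, vertex indices are 0-based, and the data are made up. The example checks that the multivariate local values sum to the multivariate global value.

```python
from math import sqrt

def standardize_columns(Y):
    """Column-wise z-scores with divisor n, mirroring zscore(Y,1)."""
    n, p = len(Y), len(Y[0])
    means = [sum(row[h] for row in Y) / n for h in range(p)]
    sds = [sqrt(sum((row[h] - means[h]) ** 2 for row in Y) / n) for h in range(p)]
    return [[(row[h] - means[h]) / sds[h] for h in range(p)] for row in Y]

def calc_Ii(Y, W, i):
    """Multivariate local Moran's I at vertex i (0-based):
    W[i,:] * Z * Z[i,:]' / (p * omega)."""
    n, p = len(Y), len(Y[0])
    Z = standardize_columns(Y)
    omega = sum(sum(row) for row in W)
    # Row vector W[i,:] * Z: weighted sum of neighbours' z-scores per variable.
    wz = [sum(W[i][j] * Z[j][h] for j in range(n)) for h in range(p)]
    return sum(wz[h] * Z[i][h] for h in range(p)) / (p * omega)

# Hypothetical example: path graph on 4 vertices, p = 2 variables.
W = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
Y = [[1.0, 1.0], [2.0, -1.0], [3.0, 1.0], [4.0, -1.0]]
local = [calc_Ii(Y, W, i) for i in range(len(Y))]
print(local)
print(sum(local))  # equals the multivariate global Moran's I
```

Standardizing inside the function keeps it a direct port; when all vertices are needed, computing the z-scores once and reusing them would avoid repeated work.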
5. Numerical Illustrations
In this section, we provide numerical illustrations of the theoretical results obtained in the previous section. Before showing these, we introduce a table of Moran’s I’s.
5.1. Table of Moran’s I’s
Table 1 tabulates Moran’s I’s. In the next subsection, we make a table from two sets of generated multivariate spatial data. and I in the table are the measures that are newly defined in the paper. Recall that is the multivariate local Moran’s I at vertex and I is the multivariate global Moran’s I. They are located in the last column of the table. In this sense, the contribution of this paper can be expressed as the addition of a final column to this table. Again, recall that denotes the local Moran’s I for variable h at vertex and denotes the Moran’s I for variable h for and . As shown, among the measures, there are the following relations:
Table 1.
Table of Moran’s I’s.
5.2. Numerical Illustrations
For numerical illustrations of the theoretical results obtained in Section 4, consider the undirected graph shown in Figure 2. It is a two-dimensional square lattice graph, which is a Cartesian product of two path graphs. We suppose that it has a binary weight matrix, and accordingly, its is given by
In this case, is equal to 24. Note that since in (39) is symmetric, it is identical to . Incidentally, Moran’s [] is a univariate global Moran’s I for a two-dimensional square lattice graph. For details, see Yamada [].
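A binary weight matrix of this kind can be generated programmatically. The sketch below assumes a 3 × 3 lattice, which is an inference from the stated weight sum of 24 (12 undirected edges, each counted twice); the function name `lattice_weight_matrix` and the row-major vertex numbering are hypothetical choices.

```python
def lattice_weight_matrix(rows, cols):
    """Binary adjacency matrix of a rows-by-cols square lattice graph,
    with vertices numbered row by row (0-based)."""
    n = rows * cols
    W = [[0] * n for _ in range(n)]
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:       # neighbour to the right
                W[i][i + 1] = W[i + 1][i] = 1
            if r + 1 < rows:       # neighbour below
                W[i][i + cols] = W[i + cols][i] = 1
    return W

W = lattice_weight_matrix(3, 3)   # assumed lattice size
omega = sum(sum(row) for row in W)
print(omega)
```

Because every edge is entered in both directions, the matrix is symmetric with a zero diagonal, so it coincides with its own symmetric part.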
Figure 2.
A two-dimensional square lattice graph.
We generate by
where , , , , and . Recall that and are eigenvectors of . Note that when is given by (39), both for and I belong to . In addition, given that , in (40) does not affect Moran’s I’s even though it does affect .
Figure 3 (respectively, Figure 4) depicts the heatmaps of for , which are generated by (40) and (41) when (respectively, ). In both figures, Panel A plots . Likewise, Panels B, C, and D respectively plot , , and . As expected, in both cases, the spatial autocorrelation structure of (respectively, ) is (respectively, not) the clearest.
Table 2 (respectively, Table 3) is the table of Moran’s I’s corresponding to the data depicted in Figure 3 (respectively, Figure 4). From these tables, we can confirm that the relations in (35)–(38) hold. For example, in Table 2, is the average of , , , and , and is the average of , , , and . In addition, from the tables, it is observable that for and I certainly belong to .
Finally, we make a remark. From Propositions 1 and 5, for and I of (respectively, ) equal (respectively, ), which is the upper (respectively, lower) bound, regardless of if . Nevertheless, the values corresponding to are far from these values, whereas those of are close to these values. This is due to the fact that is more highly contaminated by noise than is.
6. Discussion
In this section, we clarify the relationship between our multivariate global Moran’s I in (23) and Wartenberg’s [] spatial correlation matrix . Wartenberg [] defined the following matrix in our notation:
and the analysis based on its spectral decomposition is called Moran component analysis (MCA) (Lin []). When , given that , reduces to in (12) as
From Proposition 4, our multivariate global Moran’s I in (23) is related to the eigenvalues of as follows:
where for are the eigenvalues of . Hence, I is identical to the average of the eigenvalues of . In this sense, our multivariate global Moran’s I can be regarded as a value obtained from the spatial correlation matrix, . Nevertheless, it should be emphasized that is a matrix and not a single measure of spatial autocorrelation of multivariate spatial data.
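The link to the eigenvalues can be spelled out in one line. The notation is assumed for this sketch: $Z$ is the n × p matrix of standardized data, $\Omega$ is the sum of all weights, $M = Z^{\top} W Z / \Omega$ is taken to be the spatial correlation matrix, and $\mu_1, \ldots, \mu_p$ are its eigenvalues.

```latex
\begin{align*}
I = \frac{\operatorname{tr}(Z^{\top} W Z)}{p\,\Omega}
  = \frac{1}{p}\operatorname{tr}(M)
  = \frac{1}{p}\sum_{h=1}^{p} \mu_h,
\qquad M = \frac{Z^{\top} W Z}{\Omega}.
\end{align*}
```

The last equality uses only the fact that the trace of a square matrix equals the sum of its eigenvalues, so it does not require $M$ to be symmetric.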
7. Concluding Remarks
Conventional Moran’s I cannot measure the degree of spatial autocorrelation of multivariate spatial data as a single value. To address this issue, we developed Moran’s I for multivariate spatial data. It can describe the similarity/discrepancy between vectors of data at different vertices/spatial units. In addition, since the local version of Moran’s I has the same problem, we extended it as well. Subsequently, we established their properties, which are fundamental for applied work. They are summarized in Propositions 3–8. We have also illustrated them numerically.
Finally, we make a remark. In this paper, we developed Moran’s I’s for multivariate spatial data . Although we did not impose a stochastic model on , it is of course an interesting research topic to investigate the distribution of multivariate Moran’s I’s when such an assumption is made. Such investigations could be done using Imhof’s [] method. However, this is beyond the scope of this paper.
Funding
The Japan Society for the Promotion of Science supported this work through KAKENHI (grant number: 23K01377).
Acknowledgments
The author thanks four anonymous referees for their valuable comments. The usual caveat applies.
Conflicts of Interest
The author declares no conflicts of interest.
References
- Getis, A. A history of the concept of spatial autocorrelation: A geographer’s perspective. Geogr. Anal. 2008, 40, 297–309.
- Moran, P.A.P. Notes on continuous stochastic phenomena. Biometrika 1950, 37, 17–23.
- Cliff, A.D.; Ord, J.K. The problem of spatial autocorrelation. In Studies in Regional Science; Scott, A.J., Ed.; Pion: London, UK, 1969; pp. 25–55.
- Cliff, A.D.; Ord, J.K. Spatial autocorrelation: A review of existing and new measures with applications. Econ. Geogr. 1970, 46, 269–292.
- Cliff, A.D.; Ord, J.K. Spatial Autocorrelation; Pion: London, UK, 1973.
- Cliff, A.D.; Ord, J.K. Spatial Processes: Models and Applications; Pion: London, UK, 1981.
- De Jong, P.; Sprenger, C.; van Veen, F. On extreme values of Moran’s I and Geary’s c. Geogr. Anal. 1984, 16, 17–24.
- Maruyama, Y. An alternative to Moran’s I for spatial autocorrelation. arXiv 2015, arXiv:1501.06260v1.
- Anselin, L. Local indicators of spatial association—LISA. Geogr. Anal. 1995, 27, 93–115.
- Wartenberg, D. Multivariate spatial correlation: A method for exploratory geographical analysis. Geogr. Anal. 1985, 17, 263–283.
- Anselin, L. A local indicator of multivariate spatial association: Extending Geary’s c. Geogr. Anal. 2019, 51, 131–248.
- Lin, J. Comparison of Moran’s I and Geary’s c in multivariate spatial pattern analysis. Geogr. Anal. 2023, 55, 685–702.
- Yamada, H. A new perspective about Moran’s coefficient: Revisited. Mathematics 2024, 12, 253.
- Yamada, H. Geary’s c for multivariate spatial data. Mathematics 2024, 12, 1820.
- Geary, R.C. The contiguity ratio and statistical mapping. Inc. Stat. 1954, 5, 115–145.
- Dray, S. A new perspective about Moran’s coefficient: Spatial autocorrelation as a linear regression problem. Geogr. Anal. 2011, 43, 127–141.
- Murakami, D.; Griffith, D.A. Eigenvector spatial filtering for large data sets: Fixed and random effects approaches. Geogr. Anal. 2019, 51, 23–49.
- Nishi, H.; Asami, Y.; Baba, H.; Shimizu, C. Scalable spatiotemporal regression model based on Moran’s eigenvectors. Int. J. Geogr. Inf. Sci. 2023, 37, 162–188.
- Rao, C.R. Linear Statistical Inference and Its Applications, 2nd ed.; Wiley: New York, NY, USA, 1965.
- Scott, A.J.; Styan, G.P.H. On a separation theorem for generalized eigenvalues and a problem in the analysis of sample surveys. Linear Algebra Its Appl. 1985, 70, 209–224.
- Abadir, M.K.; Magnus, J.R. Matrix Algebra; Cambridge University Press: Cambridge, UK, 2005.
- Seber, G.A.F. A Matrix Handbook for Statisticians; Wiley: Hoboken, NJ, USA, 2008.
- Bapat, R.B. Graphs and Matrices, 2nd ed.; Springer: London, UK, 2014.
- Estrada, E.; Knight, P. A First Course in Network Theory; Oxford University Press: Oxford, UK, 2015.
- Yamada, H. A unified perspective on some autocorrelation measures in different fields: A note. Open Math. 2023, 21, 20220574.
- Imhof, J.P. Computing the distribution of quadratic forms in normal variables. Biometrika 1961, 48, 419–426.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).