Abstract
This paper conducts a brief survey of spatial unit roots within the context of spatial econometrics. We summarize important concepts and assumptions in this area and study the parameter space of the spatial autoregressive coefficient, which leads to the idea of spatial unit roots. Like the case in time series, the spatial unit roots lead to spurious regression because the system cannot achieve equilibrium. This phenomenon undermines the power of the usual Ordinary Least Squares (OLS) method, so various estimation methods such as Quasi-maximum Likelihood Estimate (QMLE), Two Stage Least Squares (2SLS), and Generalized Spatial Two Stage Least Squares (GS2SLS) are explored. This paper considers the assumptions needed to guarantee the identification and asymptotic properties of these methods. Because of the potential damage of spatial unit roots, we study some test procedures to detect them. Lastly, we offer insights into how to relax the compactness assumption to avoid spatial unit roots, as well as the relationship between spatial unit roots and other models, such as the Spatial Dynamic Panel Data (SDPD) model and Lévy–Brownian motion.
Keywords:
spatial correlation; spatial unit roots; nonstationarity; spurious spatial regression; panel data
MSC:
62-02
1. Introduction
There is an extensive literature in spatial statistics that deals with cross-sectional correlation and is popular in regional science, urban economics and geography, to mention a few fields; see Anselin [1] for a nice introduction to this literature. Unlike time series, there is typically no unique natural ordering for cross-sectional data. Spatial dependence models may use a metric of economic distance that provides cross-sectional data with a structure similar to that provided by the time index in time series. Examples in economics usually involve spillover effects or externalities due to geographical proximity, for example, the effect of the productivity of public capital, like roads and highways, on the output of neighboring states, or the pricing of welfare in one state that pushes recipients to other states. In a linear regression model, this spatial correlation may be in the disturbances, which gives the spatial error model (SEM), or modeled on the dependent variable itself, which gives the spatial autoregression (SAR) model, also called the spatial lag model. Unlike the autoregressive or lagged model in time series, where there is a natural ordering across time and lagged values are well defined, in cross-sections this is dealt with using neighbors: in the SEM, a unit’s shocks or disturbances are affected by its neighbors’ shocks or disturbances. For the SAR, the dependent variable is affected by neighbors, like a house price being affected by neighboring house prices. This leads to the construction of a weight matrix that defines one’s neighbors by distance or contiguity; see Anselin [1]. So, rather than lagged house prices as in an autoregressive time series model, own house price is related to a weighted average of neighboring house prices. Spatial dependence has been extended from cross-section to panel data; see Chapter 13 of Baltagi [2] or Elhorst [3] for a textbook treatment of the subject.
OLS yields inconsistent estimates in the SAR model because the spatially lagged dependent variable is correlated with the disturbances. For the SEM model, OLS yields an unbiased but inefficient estimator. Because of these limitations of OLS, (Q)MLE is often used to estimate spatial models, as in Ord [4], Anselin [1] and Lee [5]. However, MLE is sometimes computationally intensive, especially for large sample sizes, because of the requirement to compute the Jacobian term in the likelihood function. Ord [4] proposes a simplified computational procedure requiring only the eigenvalues of the spatial weight matrix, but computing accurate eigenvalues is increasingly difficult for large n. Kelejian and Prucha [6] suggest a Generalized Method of Moments (GMM) estimator for the SEM with SAR structure. Alternatively, constructing instrumental variables (IVs) from the exogenous variables, Kelejian and Prucha [7] propose the GS2SLS estimator and Lee [8] discusses the best GS2SLS estimator, which uses more efficient IVs. Lee [9] considers a GMM estimator for SAR models with exogenous variables and shows that it is more efficient than the 2SLS estimator and asymptotically as efficient as the ML estimator. Spatial panel data models (with dynamic terms) are also estimated using MLE, 2SLS or GMM, as in Yu et al. [10], Baltagi and Liu [11], and Kapoor et al. [12].
However, all of the estimation methods considered above constrain the parameter space of the spatial coefficient to limit the degree of correlation between units. This is because when such spatial correlation is too strong, the spatial echoes passing through each unit do not die out and so the system cannot achieve an equilibrium. When the spatial weight matrix is row-normalized, as is conventional, such a constraint requires the spatial coefficient to be smaller than 1 in absolute value. In this survey, we summarize the developments that relax this constraint and allow the spatial coefficient to equal unity or approach it sufficiently closely, which is known as the spatial unit root. This is of practical relevance because there are many cases where the spatial coefficients are close to 1. For example, Keller and Shiue [13] study the inter-regional trade of Chinese rice and find that rice prices for different provinces are highly related, with spatial coefficients lying between 0.9 and 0.95.
When (near) spatial unit roots exist, the standard estimation procedures are not necessarily reliable and statistical inference is invalid. To remedy such a case, Fingleton [14] avoids the circularity of the spatial weight matrix and conducts a Monte Carlo simulation to explore the performance of OLS estimation. Alternatively, Lee and Yu [15] artificially let the spatial autoregression coefficient approach the unit root and derive the asymptotic behavior of the QMLE and 2SLS estimators. Spatial unit roots have also been generalized to the Spatial Dynamic Panel Data (SDPD) model by Yu and Lee [16].
In order to investigate possible spatial unit roots, several test procedures have been proposed. Fingleton [14] suggests that a “very high” value of the Moran’s I statistic could be useful for testing spatial unit roots. Lauridsen and Kosfeld [17,18] propose two-stage Lagrange Multiplier (LM) tests that distinguish spatial unit roots from stationary positive spatial correlation. Using the fact that spatial impulses do not die out under spatial unit roots and lead to explosive variance, Beenstock et al. [19] numerically calculate the critical value even under an irregular spatial weight matrix. These tests have been extensively used in the literature; see Yesilyurt and Elhorst [20], Olejnik [21], Machado et al. [22], Beenstock and Felsenstein [23].
This paper is organized as follows. Section 2 introduces some basic concepts in spatial econometrics. Next, the parameter space of the spatial autoregressive coefficient and the corresponding singular points are considered. Stationarity and spatial cointegration concepts derived from spatial unit roots are introduced. General assumptions and the corresponding implications in spatial econometrics are discussed. Section 3 introduces the potential problems with the existence of spatial unit roots: spurious and nonsense regression. Section 4 investigates estimation methods and inference under spatial unit roots. Section 5 discusses how to test for spatial unit roots. Section 6 discusses spatial unit roots in the SAR model while Section 7 concludes.
2. Basic Concepts in Spatial Econometrics
Spatial models study the spatial dependence between units. In practice, the spatial weight matrix is used to describe such dependence. Let $W^{*} = (w^{*}_{ij})$ be the contiguity-based spatial weight matrix, i.e., $w^{*}_{ij} = 1$ if units i and j are contiguous and 0 otherwise. Also, the diagonal elements are set to 0 by convention. The spatial weight matrix is generally row-normalized by $w_{ij} = w^{*}_{ij} / \sum_{j} w^{*}_{ij}$, so each row sum of $W = (w_{ij})$ will be one.
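The construction just described can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses a regular lattice with rook contiguity (units sharing an edge are neighbors), and the function names `rook_contiguity` and `row_normalize` are hypothetical helpers, not taken from the literature surveyed here.

```python
import numpy as np

def rook_contiguity(rows, cols):
    """Binary rook-contiguity matrix for a rows x cols regular lattice (assumed layout)."""
    n = rows * cols
    w = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:              # neighbor to the east
                w[i, i + 1] = w[i + 1, i] = 1.0
            if r + 1 < rows:              # neighbor to the south
                w[i, i + cols] = w[i + cols, i] = 1.0
    return w                              # the diagonal stays zero by construction

def row_normalize(w_star):
    """Divide each row by its sum; units without neighbors keep a zero row."""
    row_sums = w_star.sum(axis=1, keepdims=True)
    return np.divide(w_star, row_sums, out=np.zeros_like(w_star), where=row_sums > 0)

W_star = rook_contiguity(3, 3)
W = row_normalize(W_star)
print(W.sum(axis=1))                      # every row sums to one
```

Note that the contiguity matrix is symmetric, while its row-normalized version generally is not, because units differ in their number of neighbors.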
Different spatial model specifications have different implications. The SEM with spatial autoregressive (SAR) structure on the error vector can be expressed as , where is known as the spatial coefficient and satisfies some assumptions that will be introduced later, and is the independent and identically distributed (i.i.d.) innovations with variance . The error covariance matrix for the with SAR structure is
where and . Though may be sparse, is not necessarily so, thus the spatial covariance structure induced by such SEM model is classified as global. Conversely, a spatial moving average (SMA) specification for the error vector can be expressed as , and the corresponding covariance matrix will be
including only and which are first and second order neighbors if is defined as first-order contiguity. Hence, such a model is generally classified as local. See Baltagi et al. [24].
Kelejian and Prucha [7] consider a “cross-sectional (first-order) autoregressive spatial model with (first-order) autoregressive disturbances” (SARAR), which is also labeled the spatial autoregressive combined (SAC) model. If the right-hand side includes both the independent variables and the spatially lagged dependent variable, then it is termed the mixed regressive, spatial autoregression (MRSAR) or mixed SAR model. The spatial Durbin model (SDM) includes both the spatially lagged dependent variable and spatially lagged independent variables. A full model, labeled the general nesting spatial (GNS) model and given in Elhorst [3], is
where is referred to as the spatial autoregressive coefficient (SAC) and is called the spatial autocorrelation coefficient.
2.1. Parameter Space of the Spatial Autoregressive Coefficient
Consider the pure SAR model with data generating process (DGP):
$$Y = \lambda W Y + \epsilon, \tag{4}$$
where $Y$ is an $n \times 1$ vector of observations on the dependent variable, $W$ is an $n \times n$ spatial weight matrix and $\epsilon$ is an $n \times 1$ vector of disturbances which are assumed to be i.i.d. $(0, \sigma^2)$. The reduced form equation of $Y$ can be written as
$$Y = S^{-1} \epsilon, \tag{5}$$
where $S = I_n - \lambda W$. So to guarantee the system achieves equilibrium, a crucial assumption is that the absolute value of the spatial coefficient is strictly less than 1 (see Kelejian and Prucha [6] (Assumption 2), Kelejian and Prucha [7] (Assumption 2)) to ensure the nonsingularity of $S$. This assumption follows from a sufficient condition for the invertibility of a matrix in Horn and Johnson [25] (Corollary 5.6.16, p. 351):
Theorem 1.
An $n \times n$ matrix $A$ is nonsingular if there exists a matrix norm $\|\cdot\|$ such that $\|I - A\| < 1$. If this condition is satisfied, $A^{-1} = \sum_{k=0}^{\infty} (I - A)^{k}$.
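A small numerical check may help fix ideas for Theorem 1. The sketch below assumes a ring of six units with row-normalized weights of 1/2 on each of the two neighbors and a hypothetical coefficient `lam = 0.6`; it compares a truncated Neumann series for the inverse of the matrix "identity minus `lam` times the weight matrix" with a direct inverse. The helper `neumann_inverse` is introduced only for this illustration.

```python
import numpy as np

def neumann_inverse(a, terms=200):
    """Approximate A^{-1} by the truncated series sum_{k < terms} (I - A)^k."""
    n = a.shape[0]
    m = np.eye(n) - a
    total = np.zeros((n, n))
    power = np.eye(n)
    for _ in range(terms):
        total += power
        power = power @ m
    return total

# Ring of 6 units: each unit has two neighbors with weight 1/2 (row-normalized).
n = 6
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5

lam = 0.6                                   # hypothetical value with |lam| < 1
S = np.eye(n) - lam * W
# Maximum absolute row sum of I - S = lam * W, which equals |lam| here.
row_sum_norm = np.abs(np.eye(n) - S).sum(axis=1).max()
approx = neumann_inverse(S)
exact = np.linalg.inv(S)
print(row_sum_norm, np.abs(approx - exact).max())
```

The series converges geometrically at rate `lam`, so the closer the coefficient is to one, the more terms are needed, which is a first hint of the difficulties that arise at the unit root.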
Thus, is invertible if there exists a matrix norm such that . It is also well known that any matrix norm is at least as large as the modulus of every eigenvalue of the matrix. Let be the eigenvector matrix and , , be the eigenvalues of , then
So it is easy to see that and therefore
because so that . A useful result is given in Ord [4]:
Theorem 2.
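If $W$ has eigenvalues $\omega_i$, $i = 1, \ldots, n$, then for any scalar $\lambda$, $|I_n - \lambda W| = \prod_{i=1}^{n} (1 - \lambda \omega_i)$.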
Moreover, the log-likelihood function for , given in (4) is
and , of which a sufficient condition is , for all i. Again, since , we obtain the range of that is given in (7).
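The eigenvalue result of Ord [4] is what makes the Jacobian term of the likelihood cheap to evaluate repeatedly. The following sketch assumes Gaussian disturbances and the pure SAR model; the function name `sar_loglik`, the ring-shaped weight matrix, the grid search and the value `lam_true = 0.5` are illustrative assumptions, not part of the surveyed papers.

```python
import numpy as np

def sar_loglik(lam, sigma2, y, W, eigvals):
    """Gaussian log-likelihood of y = lam * W y + eps with eps ~ N(0, sigma2 * I).

    The Jacobian term ln|det(I - lam W)| is computed as sum_i ln|1 - lam * omega_i|,
    so the eigenvalues omega_i of W are needed only once.
    """
    n = y.shape[0]
    logdet = np.sum(np.log(np.abs(1.0 - lam * eigvals)))
    resid = y - lam * (W @ y)
    return -0.5 * n * np.log(2.0 * np.pi * sigma2) + logdet - resid @ resid / (2.0 * sigma2)

# Ring of 8 units with row-normalized weights (an assumed toy example).
n = 8
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5
eigvals = np.linalg.eigvals(W)

rng = np.random.default_rng(1)
lam_true = 0.5
y = np.linalg.solve(np.eye(n) - lam_true * W, rng.standard_normal(n))

# Crude grid search over the interior of (-1, 1), with sigma2 fixed at 1.
grid = np.linspace(-0.9, 0.9, 37)
values = [sar_loglik(l, 1.0, y, W, eigvals) for l in grid]
lam_hat = grid[int(np.argmax(values))]
print(lam_hat)
```

With only eight observations the grid maximizer is very noisy; the point of the sketch is the evaluation of the log-determinant, not the quality of the estimate.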
However, either by Theorem 1 or 2, (7) is a sufficient condition for the invertibility of . The singular points of are by Theorem 2, and the number of these singular points is at most countable as . This raises a problem as stated in Kelejian and Robinson [26]. These singular points can generally be determined numerically as the roots of an nth-degree polynomial, and to avoid inconsistency, they should be removed from the possible values of . Griffith [27] (p. 19) states that this condition also ensures stationarity, but Kelejian and Robinson [26] give a counterexample showing that it does not when is a row-normalized double queen weight matrix.
When the matrix is row-normalized, Kelejian and Prucha [6] (Note 8, p. 120) show that by Geršgorin’s theorem, and typically, [26]. We assume , and this is why is generally interpreted as the spatial autocorrelation coefficient similar to its counterpart in time series.
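The role of Geršgorin’s theorem and of the unit root can be checked numerically. The sketch below assumes a ring of eight units with row-normalized weights; the names `radii`, `spectral_radius` and `conds` are illustrative. Every Geršgorin disc is centered at the zero diagonal element with radius equal to the row sum of one, the vector of ones is an eigenvector with eigenvalue one, and the condition number of the matrix "identity minus coefficient times weight matrix" deteriorates as the coefficient approaches one.

```python
import numpy as np

n = 8
W = np.zeros((n, n))
for i in range(n):                       # ring: two neighbors, weight 1/2 each
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5

# Gershgorin discs: center w_ii = 0, radius sum_{j != i} |w_ij| = 1 for every row.
radii = np.abs(W).sum(axis=1) - np.abs(np.diag(W))
spectral_radius = np.abs(np.linalg.eigvals(W)).max()

# Row-normalization implies W @ 1 = 1, so I - W maps the vector of ones to zero.
ones = np.ones(n)
residual_at_unit_root = (np.eye(n) - W) @ ones

# Conditioning of I - lam * W as lam approaches the unit root.
conds = {lam: np.linalg.cond(np.eye(n) - lam * W) for lam in (0.9, 0.99, 0.999)}
for lam, c in conds.items():
    print(f"lam = {lam}: condition number = {c:.1f}")
```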
2.2. Stationarity, Order of Integration and Cointegration
Stationarity is a key assumption in time series. Similarly, in spatial econometrics, when the stationary assumption does not hold, spurious or nonsense regression appears, as will be shown in the next section. Stationarity is tightly connected with . The formal definition of stationarity is given in Anselin [1] (p. 42):
Definition 1.
A process is strictly stationary if any finite subset from the stochastic process has the same joint distribution as the subset for any s, where s represents a uniform shift in time, space or time–space.
But we generally consider a weaker version, covariance stationarity. For the intuition of stationarity and the connection with the inverse of , see Beenstock et al. [19]. Consider the pure SAR model in (4) and (5), the covariance matrix for is
where and is defined in (5). Letting be the element of the matrix , by normalizing , has variance and covariance with . By Definition 1, stationarity requires that and remain unchanged asymptotically (this implicitly assumes that both location j and k are far away from the edge). Note that by Theorem 1, we have
So the stationarity assumption is equivalent to , for all k as . Since and , m represents the “remote” area, and is the step neighborhood of unit k. Thus, intuitively, stationarity requires that the shocks from far away locations will asymptotically not affect the epicenter area.
Two other concepts tightly connected with unit roots and stationarity are the order of integration and cointegration. The order of integration is originally a concept in time series that describes the minimum number of differences needed to make a non-stationary process (covariance) stationary. Cointegration, on the other hand, describes the case in which a linear combination of two or more series with the same order of integration has a lower order of integration. The formal definitions of the order of integration and cointegration are given in Hamilton [28]:
Definition 2.
A time series $X_t$ is integrated of order d, denoted $X_t \sim I(d)$, if $(1 - L)^{d} X_t$ is a stationary process, where L is the lag operator and $1 - L$ is the first difference.
Definition 3.
Time series X and Y are cointegrated of order , if both of them are , and there exists a cointegrating vector such that .
In time series, the lag operator L is defined by because of the natural order of temporal data. In spatial econometrics, we regard as the spatial lag operator and as the first order spatial difference, see Anselin [1] (pp. 22–26). We also use and to refer to spatial integration and spatial cointegration respectively.
More specifically, for the pure SAR model in (4) with a row-normalized weight matrix, if , since is stationary. Also, suppose both and are , but they have a long-term equilibrium relationship , then obviously, are with cointegrating vector .
2.3. Some Fundamental Assumptions
Different assumptions are made in spatial econometrics for different estimation methods. The most common ones are listed here, and the implications are explained.
Assumption 1.
The disturbances are i.i.d. for all n (so uniformly) with zero mean and finite variance . Additionally, fourth moments exist.
Assumption 2.
The elements of the exogenous variables are uniformly bounded for all n. The exists and is nonsingular.
Assumption 3.
The matrix is nonsingular.
The existence of moments of the disturbances up to the fourth order is needed to apply the central limit theorem for (a system of) linear-quadratic forms (see Kelejian and Prucha [29] (p. 226) and Kelejian and Prucha [30] (p. 63)). The nonsingularity of ensures that the system achieves an equilibrium and that the mean and variance of exist.
Assumption 4.
The matrices and are uniformly bounded (UB) in both row and column sums for all n. (We say a matrix is UB in row (column) sums if its maximum absolute row (column) sum is bounded by a finite constant that does not depend on n. This property is preserved under finite matrix multiplication.)
The UB condition for implicitly assumes a limited number of neighbors for all units even as , so the weight matrix is sparse for large n. This assumption is relaxed in Lee et al. [31] by introducing dominant (popular) units. In practice, the spatial units have a limited number of neighbors. Though sometimes may be defined as the inverse of the distance between i, j physically or economically, tends to be 0 between far away units as n increases. So in general this assumption is satisfied. The UB of is to ensure the covariance matrix in (9) is still UB, which limits the correlation between two different units since the UB property is preserved under matrix multiplication.
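The row and column sum conditions in Assumption 4 are easy to inspect for a given design. The sketch below assumes a path of six units with row-normalized weights and a hypothetical coefficient `lam = 0.5`; `max_row_sum` and `max_col_sum` are illustrative helper names. For a row-normalized weight matrix and a coefficient in [0, 1), the inverse of "identity minus coefficient times weight matrix" has nonnegative entries and every row sums to 1/(1 - coefficient), so its row sums stay bounded only as long as the coefficient stays away from one.

```python
import numpy as np

def max_row_sum(a):
    """Maximum absolute row sum (the matrix infinity-norm)."""
    return np.abs(a).sum(axis=1).max()

def max_col_sum(a):
    """Maximum absolute column sum (the matrix 1-norm)."""
    return np.abs(a).sum(axis=0).max()

# Path of 6 units: end units have one neighbor, interior units have two.
n = 6
W_star = np.zeros((n, n))
for i in range(n - 1):
    W_star[i, i + 1] = W_star[i + 1, i] = 1.0
W = W_star / W_star.sum(axis=1, keepdims=True)

lam = 0.5                                  # hypothetical coefficient
S_inv = np.linalg.inv(np.eye(n) - lam * W)

print(max_row_sum(W), max_col_sum(W))          # 1.0 and 1.5
print(max_row_sum(S_inv), max_col_sum(S_inv))  # row sums equal 1 / (1 - lam) = 2
# Both norms are submultiplicative, which is why the UB property
# carries over to finite products of UB matrices.
print(max_row_sum(W @ S_inv) <= max_row_sum(W) * max_row_sum(S_inv) + 1e-12)
```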
Other assumptions to ensure identification conditions or the derivation of asymptotic distributions of estimators will be mentioned when needed.
3. Spurious Regression When (Near) Unit Roots Exist
The variance of explodes when unit roots exist, and OLS estimation may perform unsatisfactorily: the estimators are inconsistent, the test statistics do not have familiar distributions, and may even converge to a constant. These phenomena have been studied extensively in time series, and similar symptoms occur in spatial econometrics.
3.1. Spurious Regression of Driftless Series and Spatial Integration
Fingleton [14] studies unit roots and spatial cointegration in spatial econometrics. Using Monte Carlo simulations, he finds that spatial unit roots will lead to a spurious regression and shows that when two vectors are spatially cointegrated, even running a regression on the error-correction model yields inconsistent estimates. Beenstock et al. [19] distinguish between the terms spurious regression and nonsense regression and argue that Fingleton [14] refers to nonsense regression instead of spurious regression. When and are driftless random walks, the nonsense regression occurs because of the increasing variances of and over time. On the other hand, the spurious regression occurs when and are independent random walks with drift, which causes their means to increase over time. See also Mur and Trívez [32].
To run the simulation, two independent pure SAR processes and containing spatial unit roots are generated separately as in (5). But as discussed in Section 2.1, does not exist under a row-normalized weight matrix when . To avoid this, Fingleton introduces the “unconnected central cell”, which manually sets one row of the spatial weight matrix equal to 0 to avoid circularity. This is a time-series analogy because there is always a starting point in temporal data (). By doing so, the singular point is slightly larger than 1, and the existence of is ensured [32]. Regressing on , the t-statistic and coefficient of determination show the significance of the parameter between two unrelated variables when spatial unit roots exist. Letting e be the OLS residuals, Moran’s I, defined as
is the spatial version of the Durbin–Watson statistic, and thus is a measure of spatial autocorrelation. The simulation results of Moran’s I show a high level of positive spatial autocorrelation in the residuals and evidence for the presence of a spurious regression.
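For reference, Moran’s I for a residual vector can be computed as below. The sketch assumes the common form of the statistic, n/S0 times e'We divided by e'e, where S0 is the sum of all weights (equal to n when the weight matrix is row-normalized); the function name `morans_i` and the ring-shaped weight matrix are illustrative assumptions. Two mean-zero test vectors show the extremes: an alternating pattern gives strong negative autocorrelation, a smooth wave gives positive autocorrelation.

```python
import numpy as np

def morans_i(e, W):
    """Moran's I of a (mean-zero) residual vector e for weight matrix W."""
    n = e.shape[0]
    s0 = W.sum()                          # equals n for a row-normalized W
    return (n / s0) * (e @ W @ e) / (e @ e)

# Ring of 6 units with row-normalized weights.
n = 6
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5

e_alternating = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0])
e_smooth = np.cos(2.0 * np.pi * np.arange(n) / n)

print(morans_i(e_alternating, W))         # close to -1
print(morans_i(e_smooth, W))              # close to 0.5
```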
To remedy this situation, spatial differencing is introduced to the SAR process with unit roots:
where and . When both and are spatial processes, we have and . The regression of the first-order spatial difference variable is equivalent to regressing two independent processes, which should theoretically yield .
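A small Monte Carlo experiment in the spirit of Fingleton [14] can illustrate both the problem and the remedy. The sketch below is not a replication: it assumes a ring of 50 units rather than a lattice, makes the first unit the "unconnected cell" by zeroing its row, uses 200 replications with a fixed seed, and tests the slope with a nominal 5% two-sided normal critical value. All function and variable names are hypothetical.

```python
import numpy as np

def noncircular_ring(n, unconnected=0):
    """Row-normalized ring weights with one row set to zero (the unconnected cell)."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5
    W[unconnected, :] = 0.0
    return W

def slope_t_stat(y, x):
    """t-statistic of the slope in an OLS regression of y on a constant and x."""
    n = y.shape[0]
    X = np.column_stack([np.ones(n), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (n - 2)
    cov = s2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(0)
n, reps = 50, 200
W = noncircular_ring(n)
D = np.eye(n) - W                # spatial difference operator; nonsingular for this W
reject_levels = 0
reject_diff = 0
for _ in range(reps):
    eps_y = rng.standard_normal(n)
    eps_x = rng.standard_normal(n)
    y = np.linalg.solve(D, eps_y)          # independent spatial unit root series
    x = np.linalg.solve(D, eps_x)
    reject_levels += int(abs(slope_t_stat(y, x)) > 1.96)
    reject_diff += int(abs(slope_t_stat(D @ y, D @ x)) > 1.96)

rate_levels = reject_levels / reps
rate_diff = reject_diff / reps
print(rate_levels, rate_diff)
```

Spatial differencing recovers the independent innovations exactly, so the second rejection rate refers to a regression between two unrelated white-noise vectors, whereas the first refers to the regression in levels.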
Next, spatially cointegrated series are considered. To generate , the “error-correction representation” is used. The idea is adopted from Robert [33] that the existence of error-correction representation is a necessary and sufficient condition for a cointegrated time series. The spatial analogy is
where is the equilibrium error assumed (for simplicity) stationary and hence the name “error-correction”. The spatial unit root series and have a long-term equilibrium . Note that (13) has two equations and two unknowns and is a noncircular matrix.
Moran’s I statistic may act as a useful indicator for cointegration because the cointegration regression (regress on ) involves endogenous variables. Also, the first-order regression is inappropriate because of omitted variable bias concerning the equilibrium error . Rearranging (13) yields the appropriate specifications:
But OLS estimation for either c or d is inconsistent because of the presence of a spatially lagged dependent variable, which is different from the traditional time series counterpart.
3.2. Spurious Regression with Deterministic Trends
Fingleton [14] studies the effect of spatial unit roots by simulation while Mur and Trívez [32] show that the variance of the spatial unit roots series explodes. For the DGP in (4)
since the contiguity-based spatial weight matrix is symmetric, can be decomposed as no matter whether it is row-normalized or not. Thus where is the eigenvalue matrix of and thus is also diagonal. So, Mur and Trívez [32] derive the variance of as
with
Let be the element of row i and column j of the matrix , and the variance of the r-th element of is then
If is row-normalized, at least one , which means that if the spatial unit roots exist and , explodes. If is not row-normalized so that is symmetric and orthogonal with , , then Mur and Trívez [32] show the variance of the observation at r reduces to
But when is not one of the singular points of , is not necessarily a function of n, i.e., the variances of do not increase as the sample size grows [32]. This is in line with the discussion of stationarity in Section 2.2 and reveals the possible source for nonsense regression concerned with the spatial unit root SAR series [34] (p. 303).
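The eigenvalue argument for the variance can be verified numerically. The sketch below assumes a symmetric, binary rook-contiguity matrix on a 4 x 4 lattice that is not row-normalized, unit innovation variance, and hypothetical coefficient values; the names `rook_contiguity` and `sar_variances` are illustrative. For symmetric weights with orthonormal eigenvectors, the variance of the r-th observation is the sum over j of the squared (r, j) eigenvector element divided by (1 - coefficient times the j-th eigenvalue) squared, which blows up as the coefficient approaches the reciprocal of the largest eigenvalue.

```python
import numpy as np

def rook_contiguity(rows, cols):
    """Binary, symmetric rook-contiguity matrix on a rows x cols lattice."""
    n = rows * cols
    w = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:
                w[i, i + 1] = w[i + 1, i] = 1.0
            if r + 1 < rows:
                w[i, i + cols] = w[i + cols, i] = 1.0
    return w

W = rook_contiguity(4, 4)                  # symmetric, not row-normalized
omega, Q = np.linalg.eigh(W)               # W = Q diag(omega) Q'

def sar_variances(lam, sigma2=1.0):
    """Variances of the elements of y = (I - lam W)^{-1} eps via the eigendecomposition."""
    return sigma2 * (Q**2 / (1.0 - lam * omega) ** 2).sum(axis=1)

# Check against the direct formula diag[(S'S)^{-1}] at a stable value.
lam = 0.2
S = np.eye(W.shape[0]) - lam * W
direct = np.diag(np.linalg.inv(S.T @ S))

# Approach the singular point 1 / omega_max from below.
v_far = sar_variances(0.9 / omega.max())
v_near = sar_variances(0.999 / omega.max())
print(v_far.max(), v_near.min())
```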
Mur and Trívez [32] focus on the spurious regression when a spatial deterministic trend exists and show that under such circumstances similar symptoms related to unit roots occur. Consider the DGP
where is a unit vector and . Comparing the term in (22) with the time trends in a typical time series model, “” is similar to the time trend “t”: t is different in terms of the relative position in time and the element of is different in terms of its relative position in space. Also, the presence of such a trend term in the SAR process leads to spurious regression. Consider the simple regression
where and are unrelated SAR processes generated by (20), respectively
Assuming and are independent white noises, we expect the estimate of in (23) to be 0. However, this is generally not the case which means spurious regression occurs. This can be seen from the fact that the correlation coefficient between and given in Mur and Trívez [32]
Though Fingleton argues that a high value of Moran’s I statistics may be a good indicator for the existence of spatial unit roots and spatial cointegration, he cannot distinguish between them, or even from the (genuine) positive spatial autocorrelation case. Some testing methods are developed and summarized in Section 5. The trend SAR series proposed by Mur and Trívez [32] seems to receive less attention, which may be due to the fact that when the mixed SAR process contains only a constant exogenous variable and is row-normalized, multicollinearity occurs; see Kelejian and Prucha [7] (p. 105), Lee [35] (p. 258), Lee [5] (p. 1907).
3.3. Spurious Regression under the near Unit Roots with a Row-Normalized, Circular Weight Matrix
The spurious regression considered in the previous two sections is under a row-normalized, noncircular weight matrix, which implicitly assumes an unconnected central unit in the spatial system, and is too restrictive to be used in empirical applications. Thus, Lee and Yu [34] study the spurious regression under a circular, row-normalized spatial weight matrix. The DGP process for the (mixed) SAR series is
In this case, , since the unit roots are singular points of . They study the consequence when approaches 1, namely
where as .
3.3.1. Decomposition of
Though the variance of explodes as , can be decomposed into a stable part and an unstable part by the decomposition of the weight matrix . This decomposition given in Lee and Yu [15] is used in Yu et al. [10,36], Yu and Lee [16] to study unit roots in a spatial dynamic panel data (SDPD) model. Because of its importance, this procedure is summarized here. See Baltagi et al. [37] and Lee and Yu [15,34] for more information.
Theorem 3.
Suppose that is a row-normalized weight matrix from a symmetric matrix , i.e., , where is a diagonal matrix with its diagonal elements formed by the row sums of . Then (i) the eigenvalues of are all real; and (ii) is diagonalizable.
(i) can be easily seen from the fact that all symmetric matrices have real eigenvalues. For (ii)
Let be the eigenvalue matrix of , and be the corresponding orthogonal eigenvector matrix, i.e., . Lee and Yu [15] show can be expressed as:
Let , . By definition, is the eigenvector of and is the corresponding eigenvalue, so the eigendecomposition of is
Moreover, the largest eigenvalues of a row-normalized matrix are 1 in absolute value. Without loss of generality, Lee and Yu [15] assume there are eigenvalues equal to 1 and let
where is vector of ones and for all i. So the eigenvalue matrix can be decomposed into two parts:
where and . Accordingly,
where and .
Lee and Yu [15] note that and , so
Denoting , and where is the true value of , they obtain and thus
since . Similarly,
It can be seen from (38) that when , is ill-conditioned because and hence the variance of explodes. This is caused by the first unstable term in (40) (the second term is stable). Thus, is of order , which may grow too fast to yield useful asymptotic analysis. Thus, a rate-adjusted factor is needed to maintain a controllable rate. A similar idea applies to QMLE and 2SLS methods for estimation as shown later.
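The argument behind Theorem 3 can be illustrated numerically. The sketch below assumes a 3 x 4 rook lattice; the names `W_star`, `A`, `Q` and `R` are illustrative. A row-normalized matrix built from a symmetric contiguity matrix is similar to a symmetric matrix, obtained by scaling with the square roots of the row sums on both sides, so its eigenvalues are real and its eigenvectors follow from those of the symmetric matrix.

```python
import numpy as np

def rook_contiguity(rows, cols):
    """Binary, symmetric rook-contiguity matrix on a rows x cols lattice."""
    n = rows * cols
    w = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:
                w[i, i + 1] = w[i + 1, i] = 1.0
            if r + 1 < rows:
                w[i, i + cols] = w[i + cols, i] = 1.0
    return w

W_star = rook_contiguity(3, 4)
d = W_star.sum(axis=1)                     # row sums, the diagonal of D
W = W_star / d[:, None]                    # W = D^{-1} W*, row-normalized

A = W_star / np.sqrt(np.outer(d, d))       # D^{-1/2} W* D^{-1/2}, symmetric
omega, Q = np.linalg.eigh(A)               # real eigenvalues, orthonormal Q
R = Q / np.sqrt(d)[:, None]                # D^{-1/2} Q: eigenvectors of W

eig_W = np.linalg.eigvals(W)
print(np.abs(eig_W.imag).max())            # numerically zero
print(omega.max())                         # largest eigenvalue equals one
```

Since `R` is invertible, with inverse equal to the transpose of `Q` times the diagonal matrix of square-root row sums, the weight matrix is diagonalizable even though it is not symmetric.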
3.3.2. Spurious Regression of OLS under Near Unit Root
To study spurious regression, following Fingleton [14], Lee and Yu [34] consider the DGP that is similar to (26) but without exogenous variables:
Denote , where is the vector of ones. Let be a scalar, be an vector and . OLS, which may yield spurious regression, is then
where is an vector of disturbances. To make sure the variable of interest is under controllable order, scale and as
Lee and Yu [34] introduce a sufficient condition for Assumption 4 that ensures the UB of , by (38):
Assumption 5.
and are .
Under Assumption 5, is UB. To study the properties of OLS estimation, it is sufficient to show the asymptotic behaviors of and , where . Proofs of these properties are about orders of matrices and random vectors as well as first and second moments of quadratic forms (some useful lemmas can be found at https://www.asc.ohio-state.edu/lee.1777/wp/sar-qml-r-appen-04feb.pdf, accessed on 28 March 2024). The OLS estimates for and : can be expressed in terms of and
where
Lee and Yu [34] notice that the scaling factor is needed because the columns of have different orders and is of order . Based on this fact, they give the following result:
where , means the remaining terms of are at most of order and symbols are defined in (43), for all .
Moreover, has limiting variance matrix:
Lee and Yu [34] also adjust Assumption 2 to ensure is of full rank when and obtain the asymptotic distributions of the OLS estimators:
Assumption 6.
.
Denoting
then
Since is independent of , one may expect insignificant in (42). However, whether converges to 0 in probability or not depends on the factor for (note that in (49) is diagonal): if , is -consistent; if , is asymptotically normal because its limiting variance does not converge to 0; if , is not bounded in probability and diverges. Intuitively, spurious regression will not occur if approaches ∞ faster than , or equivalently, approaches 1 more quickly than [34].
3.3.3. Other Test Statistics
It is also important to discuss the statistical properties of other test statistics based on (42), which could be potentially useful for distinguishing spurious regression. Lee and Yu [34] give the following theorem as a prerequisite:
Theorem 4.
Under Assumptions 1 and 4, for any nonstochastic UB square matrix ,
where is the projection matrix of
Based on this theorem, Lee and Yu [34] show the order of the estimated variance of the disturbances is
and
where is an random vector with its elements . Thus does not have a familiar asymptotic standard normal distribution, and the F-statistic has no familiar distribution either.
Even though the t- and F-statistics are not reliable, Lee and Yu [34] suggest the combination of and Moran’s I could be a good indicator for spurious regression under near unit roots. Let , where is an vector of ones, the coefficient of determination is
And the Moran’s I statistic is
3.3.4. Constant Terms in the DGP of ’s
Lee and Yu [34] also study the constant term and unit roots at the same time. It could be shown that the estimation of is the same as in (49) after reparameterization. Consider the DGP of series with near unit roots and a constant term as
Regress on and :
where is a constant and is similarly defined as above. Since is row-normalized, , we have . So (57) could be rewritten as:
3.4. “Spurious” Regression with Equal Weights
Baltagi and Liu [38] show that under the special case where the spatial weight matrix is row-normalized with equal spatial weights, i.e.,
$$w_{ij} = \frac{1}{n-1} \ \text{ for } i \neq j, \qquad w_{ii} = 0,$$
spurious regression will not occur. This spatial weight matrix “is naturally suggested if all units are neighbors to each other and there is no other natural or observable measure of distance [39]”, such as interactions between students in a class or workers in a firm, etc. Without loss of generality, the DGP is assumed to be
Consider the regression
By the Frisch–Waugh–Lovell Theorem, the OLS estimation of is , where and is an matrix of ones. Kelejian and Prucha [39] show the inverse of the matrix , , is
where and . Using the fact that , Baltagi and Liu [38] show and , so the asymptotic distribution of is given by
The asymptotic distribution of does not depend on and and is consistent (compared with (49)), which means that the spurious regression does not occur.
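The mechanism behind this result can be seen in a short numerical check. The sketch below assumes the equal-weights matrix with off-diagonal elements 1/(n-1), a hypothetical sample size `n = 20` and a coefficient `lam = 0.999` close to the unit root. The closed-form inverse used here is derived directly from the structure "a multiple of the identity minus a multiple of the matrix of ones" and is written in this sketch’s own notation, which need not match the expression in Kelejian and Prucha [39]. The inverse consists of a scaled identity plus a multiple of the matrix of ones, and the latter is annihilated by demeaning.

```python
import numpy as np

n, lam = 20, 0.999                          # hypothetical values, lam near the unit root
I = np.eye(n)
J = np.ones((n, n))
W = (J - I) / (n - 1)                       # equal weights, zero diagonal, rows sum to one

S = I - lam * W                             # equals a * I - (lam / (n - 1)) * J
a = 1.0 + lam / (n - 1)
S_inv_closed = (I + lam / ((n - 1) * (1.0 - lam)) * J) / a

M = I - J / n                               # demeaning (annihilator of the constant)
demeaned_inverse = M @ S_inv_closed         # the explosive J-part drops out

print(np.abs(np.linalg.inv(S) - S_inv_closed).max())
print(np.abs(demeaned_inverse - M / a).max())
```

After demeaning, the series behaves like scaled white noise no matter how close the coefficient is to one, which is consistent with the absence of spurious regression in this special case.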
4. Estimation and Inference
The spurious regression studied in the previous section suggests that OLS is not a good estimator in the SAR model with spatial unit roots. QMLE and 2SLS are alternative methods of estimation. This section briefly reviews these estimators and their performance under near unit roots.
When the error term of the DGP has a spatial autoregressive structure, OLS and (feasible) generalized least squares ((F)GLS) estimators are consistent. The efficiency of the OLS estimator was considered by Krämer and Donninger [40], Tilke [41] for the symmetric spatial weight matrices, and generalized by Krämer and Baltagi [42] with a broader covariance matrix. But the symmetry of the weight matrix is too restrictive to be used in practice, so Martellosio [43] generalizes this to nonsymmetric weights matrices. The efficiency of the OLS estimator is defined as . But these papers generally focus on the relationship between and X, for example, when the column space of is contained by the column space of X. Following Lee and Yu [34], Baltagi et al. [37] derive the asymptotic properties of OLS, (F)GLS of and point out important differences from conventional theory based on stationary spatial error.
A special SAR model has also been considered in regular lattices under spatial unit roots with the form
The simplest case of this special SAR model is the doubly geometric spatial autoregressive process:
It is called “unilateral” because only the preceding units affect the later ones, so the weight matrix is lower triangular. It can be considered as the combination of two autoregressive (AR) models [44]. This model has been widely used in image processing, agricultural trials, digital filtering, and other fields. Model (64) is unstable when either or because of the existence of spatial unit roots [45,46].
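The "combination of two AR models" interpretation can be made concrete. The sketch below assumes the commonly used form of the doubly geometric process on a regular lattice, Y[i, j] = alpha Y[i-1, j] + beta Y[i, j-1] - alpha beta Y[i-1, j-1] + eps[i, j], with zero initial values; the parameter names `alpha` and `beta` and both helper functions are assumptions of this illustration. Applying an AR(1) filter along one lattice direction and then along the other reproduces the direct recursion, and with both parameters equal to one the process becomes a two-dimensional cumulative sum of the innovations.

```python
import numpy as np

def doubly_geometric(eps, alpha, beta):
    """Direct recursion with zero initial values on the lattice border."""
    m, n = eps.shape
    y = np.zeros((m + 1, n + 1))            # zero padding holds the initial conditions
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            y[i, j] = (alpha * y[i - 1, j] + beta * y[i, j - 1]
                       - alpha * beta * y[i - 1, j - 1] + eps[i - 1, j - 1])
    return y[1:, 1:]

def ar1_filter(x, phi, axis):
    """Apply z_t = phi * z_{t-1} + x_t along one axis, starting from zero."""
    x = np.moveaxis(x, axis, 0)
    z = np.zeros(x.shape)
    prev = np.zeros(x.shape[1:])
    for t in range(x.shape[0]):
        prev = phi * prev + x[t]
        z[t] = prev
    return np.moveaxis(z, 0, axis)

rng = np.random.default_rng(2)
eps = rng.standard_normal((15, 12))

y_direct = doubly_geometric(eps, 0.7, 0.4)
y_two_ar = ar1_filter(ar1_filter(eps, 0.4, axis=1), 0.7, axis=0)

# Unit roots in both directions: a two-dimensional cumulative sum.
y_unit = doubly_geometric(eps, 1.0, 1.0)
print(np.abs(y_direct - y_two_ar).max())
```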
A more complicated special case of the unilateral model is
Because of its simple form, the estimation and inference of (64) can be derived without too many assumptions, as will be seen shortly. Moreover, when spatial unit roots exist, the limit of the variance of is analytically obtained in Baran [47] when the parameters are located in the interior, on the edges, and on the vertices of the domain of stability. Paulauskas [48] explicitly shows that the growth rate of the variance of Y is different in dimensions and . Though this approach studies spatial unit roots from a different angle than that of Fingleton [14], it points to some similarity as in a recent paper [49] that will be discussed in Section 6.
Another possible way to remedy the problem caused by spatial unit roots in the SAR and SEM models is to relax the compactness assumption. When some parameters approach the boundary of the parameter space, consistency of extremum estimators could be obtained with compact parameter spaces. Thus the compactness assumption is standard in spatial econometrics because of the existence of the singular point ; see the proofs in Kelejian and Prucha [6], Lee [5], and Gupta [50]. But such an assumption is also restrictive in the sense that if we choose an arbitrary compact set in the open parameter space, the true global optimizer may be excluded, especially in near unit root cases. A recent paper by Liu et al. [51] generalizes Theorem 2.7 in Newey and McFadden [52] (p. 2133), which relaxes compactness when the objective function of an extremum estimator is concave, and allows the non-stochastic objective functions to depend on the sample size n. This generalization is suitable for spatial econometrics models because the sample observations are usually in triangular arrays. (A triangular array is a doubly indexed sequence of numbers or polynomials. Each row of the array is only as long as the row’s index. For example, the ith row contains only i elements.) The consistency of the QMLE of the SAR model and the MLE of the SAR Tobit model are investigated. But a closed-form solution is not obtained. On the other hand, Gupta [53] proposes a Newton-step computational algorithm for the QMLE of a large-parameter-size SAR model, which is free from the compactness assumption. Under the normality assumption, it has the same asymptotic efficiency as MLE, but has a closed-form solution and is computationally simple.
4.1. QMLE and 2SLS Methods for the (Mixed) SAR Model
4.1.1. Quasi-Maximum Likelihood Estimation Method
Lee [5] investigates the asymptotic distribution of the QMLE estimator of the mixed SAR model, which is the starting point for further analysis when spatial unit roots exist. This analysis is based on the discussion of the singularity of the information matrix of the log-likelihood function. Especially when the information matrix is singular, a scaling factor is needed, where is the order of the elements of the spatial weight matrix and thus is the order elements of . We have seen in (39) that the order of elements of is in the near unit roots case; thus, a similar scaling factor will be needed.
The (mixed) SAR model under consideration is
with its reduced form
since .
Also, Lee [5] imposes a weaker assumption about the spatial weight matrix and derives the information matrix.
Assumption 7.
The elements of are at most of order , denoted by , uniformly in all where the rate sequence can be bounded or divergent. The ratio as n goes to infinity.
Let , where , then the log-likelihood function of (67) is
where . The information matrix is
where
with . The existence of the extra is because is not necessarily normally distributed.
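To make the likelihood concrete, the following is a minimal sketch of the Gaussian quasi-likelihood of a mixed SAR model after concentrating out the regression coefficients and the error variance. The notation (`lam` for the spatial coefficient, `w` for the weight matrix), the ring-shaped weight matrix, the simulated data, and the grid search are all assumptions made for illustration; they are not taken from Lee [5].

```python
import numpy as np

def ring_weights(n):
    """Row-normalized weights on a ring: each unit has its two adjacent units as neighbors."""
    w = np.zeros((n, n))
    idx = np.arange(n)
    w[idx, (idx - 1) % n] = 0.5
    w[idx, (idx + 1) % n] = 0.5
    return w

def concentrate(lam, y, x, w):
    """Return (log-likelihood, beta_hat, sigma2_hat) for a given spatial coefficient lam."""
    n = y.shape[0]
    s = np.eye(n) - lam * w
    sy = s @ y
    beta_hat, *_ = np.linalg.lstsq(x, sy, rcond=None)   # OLS of (I - lam W) y on X
    resid = sy - x @ beta_hat
    sigma2_hat = resid @ resid / n
    sign, logdet = np.linalg.slogdet(s)                 # log |I - lam W|
    if sign <= 0:
        return -np.inf, beta_hat, sigma2_hat
    loglik = -0.5 * n * (np.log(2 * np.pi) + 1) - 0.5 * n * np.log(sigma2_hat) + logdet
    return loglik, beta_hat, sigma2_hat

def qmle_sar(y, x, w, grid):
    """Maximize the concentrated log-likelihood over a grid of candidate lam values."""
    values = [concentrate(lam, y, x, w)[0] for lam in grid]
    lam_hat = grid[int(np.argmax(values))]
    _, beta_hat, sigma2_hat = concentrate(lam_hat, y, x, w)
    return lam_hat, beta_hat, sigma2_hat

# Simulated example with illustrative parameter values.
rng = np.random.default_rng(1)
n, lam0, beta0 = 300, 0.4, np.array([1.0, 2.0])
w = ring_weights(n)
x = rng.standard_normal((n, 2))
y = np.linalg.solve(np.eye(n) - lam0 * w, x @ beta0 + rng.standard_normal(n))

lam_hat, beta_hat, sigma2_hat = qmle_sar(y, x, w, np.linspace(-0.95, 0.95, 191))
print(lam_hat, beta_hat, sigma2_hat)
```

A grid search is used only to keep the sketch dependency-free; in practice a one-dimensional numerical optimizer would usually replace it.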
To ensure the asymptotic distribution of QMLE exists, must be well defined. Lee [5] proves the nonsingularity of can be guaranteed by the fact that there does not exist a nonzero vector such that a linear combination of columns of is 0. This condition could be simplified as: there does not exist a , such that
Since each term in (72) is greater or equal to 0 (the first term is non-negative because it is symmetric; for the second term, where and ), Lee [5] studies the singularity of the information matrix in terms of these two terms, respectively. For the first term, by the partition matrix formula, is nonsingular if and only if and are nonsingular. Moreover, under Assumption 7, if and are independent, one sufficient condition for the nonsingularity of could be:
Assumption 8.
exists and is nonsingular.
However, Lee [5] states that if and are linearly dependent, for example, when is row-normalized and the relevant regressor is only the constant term, Assumption 8 should be adjusted to guarantee the second term in (72) is greater than 0:
Assumption 9.
and the is a bounded sequence and, for any ,
Then under Assumption 7 and either 8 or 9, the asymptotic distribution of QMLE will be
where , .
The above results are based on being invertible, a necessary condition for which is that is a bounded sequence. However, when will become singular because . Also, the singularity of the information matrix implies that the score function will be too flat to be useful and thus an adjustment of the rate should be imposed as in Lee [5]:
Assumption 10.
The is a divergent sequence, elements of have the uniform order , and with . Under this situation, either (a) , or (b) if
whenever .
Lee [5] gives the asymptotic distributions of the QMLE under this rate-adjusted assumption:
4.1.2. Generalized Spatial Two-Stage Least Squares (GS2SLS) Method
Kelejian and Prucha [7] proposed the GS2SLS method for the “cross-sectional (first-order) autoregressive spatial model with (first-order) autoregressive disturbances”:
where and . Since is endogenous, it should be instrumented. Assuming is known (a consistent estimator is given by Kelejian and Prucha [6]), a Cochrane–Orcutt (CO)-type transformation applied to (77) yields the transformed regression
should be instrumented. The ideal instruments are of course
Then, by (10), , the ideal instruments are:
In practice, is used.
Assumptions about the instrument matrix are made to ensure that exists and is of full rank. In the near unit roots case, similar to the QMLE method, scaling factors are needed to guarantee this property as we will see later. With these assumptions, the GS2SLS procedure has three steps, as in Kelejian and Prucha [7], as follows:
- Run 2SLS on with instruments . This yields , where , is the projection matrix of , and .
- Estimate by Kelejian and Prucha [6] according to the GMM system: where , then solve , or by . Both and are consistent, but is more efficient.
- Assuming is known, running 2SLS on the CO-transformed regression (78) with instruments yields , where . By replacing by its consistent estimator (from Step 2), the feasible 2SLS estimator is
Obviously, this procedure is for a SARAR model. If it is a SAR model, only step 1 is needed and .
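For the pure SAR case, where only Step 1 is needed, the 2SLS computation can be sketched as follows. The instrument set [X, WX, W²X], the ring-shaped weight matrix, the parameter values, and all function names are assumptions for illustration. The noise-free check relies on the fact that 2SLS reproduces the coefficients exactly when the structural equation holds without error.

```python
import numpy as np

def ring_weights(n):
    """Row-normalized weights on a ring: each unit has its two adjacent units as neighbors."""
    w = np.zeros((n, n))
    idx = np.arange(n)
    w[idx, (idx - 1) % n] = 0.5
    w[idx, (idx + 1) % n] = 0.5
    return w

def sar_2sls(y, x, w):
    """2SLS for y = lam * W y + X beta + eps; returns [lam_hat, beta_hat...]."""
    z = np.column_stack([w @ y, x])                # regressors, spatial lag first
    h = np.column_stack([x, w @ x, w @ w @ x])     # instruments: X, WX, W^2 X
    # Fitted values of Z from a least-squares projection on H (avoids forming P_H).
    coef, *_ = np.linalg.lstsq(h, z, rcond=None)
    z_hat = h @ coef
    return np.linalg.solve(z_hat.T @ z, z_hat.T @ y)

rng = np.random.default_rng(2)
n, lam0, beta0 = 500, 0.4, np.array([1.0, 2.0])
w = ring_weights(n)
x = rng.standard_normal((n, 2))
s_inv = np.linalg.inv(np.eye(n) - lam0 * w)

y_exact = s_inv @ (x @ beta0)                              # no disturbance
y_noisy = s_inv @ (x @ beta0 + rng.standard_normal(n))     # with disturbance

delta_exact = sar_2sls(y_exact, x, w)
delta_noisy = sar_2sls(y_noisy, x, w)
print(delta_exact, delta_noisy)
```

The regressors here exclude a constant on purpose: with a row-normalized weight matrix, the spatial lag of a constant is the same constant, so the instrument matrix would lose rank.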
4.1.3. Best Generalized Spatial Two Stage Least Squares (BGS2SLS) Estimators
Lee [8] does not drop the higher-order terms in (80) but uses the fact that by the definition of and proposes the best instrument:
With the corresponding simple instrumental variable estimator, the BGS2SLS is
If there is no SAR structure in the disturbance term, the best instrument is
and in (85) could be replaced by any consistent estimator such as the KP-GS2SLS estimator.
Compared with (80), does not drop off the higher-order terms and is numerically equivalent to the ideal instrument, which in turn yields asymptotically optimal instrumental variable estimators.
4.2. Near Unit Roots in the SAR Model
Lee and Yu [15] study the asymptotic distribution of QMLE and 2SLS estimators of the SAR model by decomposing (see Section 3.3.1). The model is given as
where . And the reduced form, again, is
4.2.1. QMLE
Obviously, the generated regressor is explosive because of in , see (39). This is very similar to the case in Assumption 10 (a) with in Section 4.1.1. This implies that the convergence rate of estimators of is not the usual case as in (76).
Additional assumptions are made in Lee and Yu [15] to control the magnitude of the unstable part of (Assumption 11), and specify the identification condition (Assumption 12), which are adjusted from Assumption 8:
Assumption 11.
(1) ; (2) ; (3) for any finite constant .
Assumption 12.
holds.
Assumption 11 (1) (2) guarantee that is not too small compared to n and (3) implies that it is not too large. Assumption 12 is equivalent to (see Lee and Yu [15] (p. 338, Lemma 1 (7))) , which is a rate-adjusted version of Assumption 8, that ensures the identification uniqueness and implies nonsingularity of . For a detailed discussion of these two assumptions, see Baltagi et al. [37] (p. 6). Also, since the adjusted rate is , QMLE would be -consistent, see (76) derived under Assumption 10.
The information matrix will be the same as in (70). In Section 3.3.2, when studying the spurious regression of OLS, we mentioned that the scaling factor is needed in terms of the order of . Here, for the QMLE, a similar scaling factor is also introduced because the elements of and have different orders owing to the presence of . The second column and row of and , which are the derivatives with respect to the spatial coefficient, contain ; thus, they have to be scaled by a factor . Specifically, the element should be scaled by a factor . This can be done by left- and right-multiplying by a matrix , where
Thus, Lee and Yu [15] give
Let and and assume they exist, the asymptotic distribution of is
Recently, Rossi and Lieberman [54] combine the near unit roots with a similarity-based weight matrix and study the consistency of the QMLE estimator when the spatial coefficient and , by allowing uncentered units. The element of the similarity-based weight matrix is , where is some function that measures the similarity between units i and j according to some parameter . The parameters they are most interested in are . They establish the connection between the and the order of uniform absolute row-sum norm of , . This means that [54] (p. 11, Proposition 1). Recall that , so when , the variance is independent of n, corresponding to the standard SAR setup ( and fixed). In the case , the variance increases with the sample size but at a slower rate, which is the case that we have seen when studying near unit roots; but when , , the variance increases so fast that the non-standard limit distribution of has to be established on a case-by-case basis, according to the resulting . Their result is much more complicated than that of Lee and Yu [15] because of the introduction of a similarity structure in the weight matrix, but it is much more flexible since now is no longer fixed but data-driven, and it is potentially more useful in empirical work.
4.2.2. GS2SLS and BGS2SLS
Lee and Yu [15] derive the 2SLS estimators and their asymptotic distributions using the procedures mentioned above. Using instruments defined in (80), the GS2SLS estimator of is
Since contains , the adjustment by is needed. The asymptotic properties of the GS2SLS estimators of and are obtained as follows:
where . So the GS2SLS estimator of is -consistent, which is higher than the usual rate in Kelejian and Prucha [7] and Lee [8], but has the usual rate of convergence.
Choosing the instrument as in (85), the BGS2SLS estimator is with asymptotic distribution:
where . Since is negative semidefinite, the BGS2SLS estimator is more efficient.
The above result is based on the fact that and are independent, which makes sure that the instrument matrix is of full rank, otherwise the 2SLS estimator will be inconsistent [9]. Liu [55] shows that even though and are linearly dependent, i.e., , where is a nonzero vector, we still have , under near unit roots case; and , under the regular case, as long as . This is equivalent to and , which are asymptotically independent.
To provide guidelines for empirical studies, Lee and Yu [15] conduct simulations to compare the performance of QMLE and 2SLS methods. QMLEs are relatively robust whether the error term is normally distributed or not. Moreover, as n increases (and the spatial coefficient is closer to the spatial unit roots), the QMLEs perform better than the 2SLS estimators because of smaller variances. One interesting phenomenon is that the best 2SLS estimators are even worse than the regular 2SLS estimators in some cases, which violates the theoretical result as shown in (93). One possible reason for this is that the best 2SLS estimator requires an initial consistent estimator by construction (see (85)) and under spatial unit roots, such an initial estimator may not be accurately calculated.
4.3. Near Unit Roots in the SEM Model
Baltagi et al. [37] extend the study of near unit roots from SAR model to SEM model by considering the OLS, GLS and FGLS estimation and properties of the corresponding statistics. The model is given as
with . Similar to Lee and Yu [15], could be decomposed to
One more assumption that they impose is
Assumption 13.
The elements of are nonstochastic and bounded, uniformly in n, exists and is nonsingular. exists. Furthermore, exists and is nonsingular.
The OLS estimator has the asymptotic distribution when
and when , . Thus , which is -consistent and is slower than the stationary error term case.
Baltagi et al. [37] also study the asymptotic properties of the GLS and FGLS estimators. If is known, , and
which implies that is robust for the near unit roots in the error term because it has rate of convergence. The feasible GLS (FGLS) could be achieved by replacing by a consistent estimator , which yields
where . It can be seen that FGLS is identical to the QMLE: the concentrated log-likelihood function of (94) with respect to is
where
The QMLE is of order , which is -consistent and
Comparing with (98), is a FGLS of using . Thus, the QMLE and the infeasible GLS estimator have the same asymptotic distribution as shown before. Next, Baltagi et al. [37] consider the Wald test statistic for the null hypothesis for OLS, GLS and FGLS, where R is a matrix of rank and r is . For OLS,
where
does not have a standard distribution, which is similar to the F-statistic shown above. However, the GLS Wald statistic
has a chi-squared limiting distribution.
Baltagi et al. [37] conduct extensive simulations. Using the root mean squared error (RMSE) as the evaluation criterion, the QML (FGLS) estimators perform uniformly better than the OLS estimator. In particular, when the spatial coefficient is sufficiently close to 1 and the sample size n increases, the RMSE of the OLS estimator grows dramatically. Together with the fact that the Wald test statistic based on the QML method has a standard chi-squared distribution, QMLE is recommended when near spatial unit roots exist in the spatial error model.
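The efficiency loss of OLS can be illustrated without any simulation by comparing exact finite-sample covariance matrices for a fixed design. The sketch below is not the experiment of Baltagi et al. [37]; it assumes a spatial error process u = (I - rho W)^{-1} eps with unit error variance, a ring-shaped row-normalized weight matrix, and one random regressor plus a constant, all chosen for illustration.

```python
import numpy as np

def ring_weights(n):
    """Row-normalized weights on a ring: each unit has its two adjacent units as neighbors."""
    w = np.zeros((n, n))
    idx = np.arange(n)
    w[idx, (idx - 1) % n] = 0.5
    w[idx, (idx + 1) % n] = 0.5
    return w

def ols_gls_variances(x, w, rho):
    """Exact covariance matrices of OLS and (infeasible) GLS when Var(u) = [(I-rho W)'(I-rho W)]^{-1}."""
    n = x.shape[0]
    r = np.eye(n) - rho * w
    omega_inv = r.T @ r
    omega = np.linalg.inv(omega_inv)
    xtx_inv = np.linalg.inv(x.T @ x)
    var_ols = xtx_inv @ x.T @ omega @ x @ xtx_inv   # sandwich form
    var_gls = np.linalg.inv(x.T @ omega_inv @ x)
    return var_ols, var_gls

rng = np.random.default_rng(4)
n = 101
w = ring_weights(n)
x = np.column_stack([np.ones(n), rng.standard_normal(n)])

# Ratio of slope variances (OLS over GLS) as rho moves toward 1.
slope_ratios = {}
for rho in (0.5, 0.9, 0.99):
    var_ols, var_gls = ols_gls_variances(x, w, rho)
    slope_ratios[rho] = var_ols[1, 1] / var_gls[1, 1]
print(slope_ratios)
```

By the Gauss-Markov (Aitken) theorem each ratio is at least 1; how quickly it grows as rho approaches 1 depends on the design and the weight matrix.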
4.4. Doubly Geometric Spatial Autoregressive Process
The main difference between the SAR and the doubly geometric spatial autoregressive models is that the spatial dependence form of the latter is clearly specified. However, for the SAR model, such dependence relies on the specification of whose explicit form varies in different situations.
For model (65), based on the observation , Baran [47] shows that the asymptotic normality of the estimators is in the stable case ( and ), with some covariance matrix . For the unstable case (, ), using the martingale central limit theorem, Bhattacharyya et al. [56,57] show “one step Gauss-Newton” estimators are asymptotically normal with convergence rate . This is different from the classical time series , where the OLS estimator converges to a fraction of functionals of the standard Brownian motion: [58] (p. 281).
Baran and Pap [59] consider the more complicated model as in (66). The model is stable if and only if , where S is the open tetrahedron with vertices . They also prove that the OLS estimator is asymptotically normally distributed with the convergence rate n when the model is stable, and otherwise. (The simpler model , with possibly was investigated in Baran et al. [60], Baran et al. [44], Baran and Pap [61] under stable and unstable cases. Under different settings, the limiting distribution of the OLS estimator is normal but has different rates of convergence.)
Roknossadati and Zarepour [62,63] study the limiting behavior of M-estimation for the near unit roots of model (65). The M-estimator of is defined as the minimizer of the objective function
for some convex function . Roknossadati and Zarepour [62] show that the self-normalized M-estimators are asymptotically normal, and when the series is stable, the convergence rate of M-estimators is still , same as in Bhattacharyya et al. [56,57]. But if it is unstable, i.e., when the model has infinite variance innovations, the M-estimates have a higher consistency rate.
5. Tests for Spatial Unit Roots and Nonstationarity
Recognizing the possible consequences of spatial unit roots, it is necessary to test for it. In fact, in nonstationary cases, the estimator is inconsistent and diverges [64]. If the series contains spatial unit roots, one may employ the spatial first difference as recommended by Fingleton [14]: after the first-order difference, such series will be converted to a stationary one, otherwise it is over-differenced and spatial correlation still exists. Based on this idea, Lauridsen and Kosfeld [17,18] propose two-stage LM tests to check for spatial unit roots. However, such LM tests have a high power function because of in finite samples and are not useful for spatial cointegration since they mis-specify the regression in the second stage. A Wald test is proposed by Lauridsen and Kosfeld [65] but it does not have a usual distribution so simulation has to be conducted before each test to obtain the critical values. A different approach introduced by Beenstock et al. [19] uses the fact that when spatial unit roots exist, the variance explodes and the spatial impulse does not die out as distance increases, so they iterate on the parameter space to find out the value of the unit roots (for irregular lattice) and then generate nonstationary series to conduct interval estimation.
Martellosio [66] derives the power properties of invariant tests, for example, , where is the OLS residuals and Q is a fixed matrix. When , we obtain the Cliff–Ord test. When the regression contains only a constant, the Cliff–Ord test reduces to Moran’s test as introduced before, which is best locally invariant as shown by King [67]. It has been shown that for the SEM model, as , the test power vanishes. For the SAR model, as , the limiting power is either 0 or 1. Krämer [68] shows similar conclusions but focuses on the symmetric weight matrix. Martellosio [69] further shows the power of any test vanishes as spatial correlation increases for a set of regression spaces. Heteroskedasticity-robust tests have also been studied. For example, Born and Breitung [70] and Baltagi and Yang [71] design diagnostic tests for SEM and SAR employing the outer product of gradients (OPG) variant of the LM test, which are robust against heteroskedastic (and non-normal) errors. But these tests suffer from the same deficiency as in Martellosio [66] because such tests are asymptotically equivalent to Moran’s I. Baltagi and Yang [72] have also shown that the standard LM test undergoes finite sample distortion when spatial dependence is heavy in both spatial and panel data settings. Recently, Preinerstorfer [73] suggests some modified tests to avoid this “zero-power trap” phenomenon, which work well for small spatial autocorrelation, but still have limiting power smaller than 1 (only 0.619 by simulation). Thus, the invariant test of null hypothesis is not satisfactory when spatial unit roots exist, and methodologies to determine it (Tests of the null hypothesis) deserve more attention.
5.1. Two-Stage LM Test for the Sources of Spurious Spatial Regression
Lauridsen and Kosfeld [17] develop a two-stage LM test to distinguish between two possible sources for spurious regression. The first one is the existence of spatial (near) unit roots in the regressand and/or regressors as in Fingleton [14], Lee and Yu [34]; the second is that the spatial error term itself is nonstationary. So the LM tests are essentially testing if the spatial process is stable or not. The idea originates from the fact that Fingleton [14] suggests a high value of Moran’s I statistic as an indicator for both spatial nonstationarity and spurious regression, but we cannot distinguish between them or even distinguish between the nonstationarity and the positive spatial correlation among the error terms, which by definition imply a high value of Moran’s I. Specifically, we are trying to distinguish if (i) the is a SAR process and we regressed on or (ii) the model itself is SEM as , where , because both (i) and (ii) can cause spatial autocorrelation.
There are at least three advantages of the LM tests [17]. First is that compared with Wald or LR, LM is usually simpler to compute because it is constructed under . Second is that with the LM test, it is possible to control for some omitted model features such as heterogeneity and autoregression, as in Anselin [74], which will be discussed later. The last one is that, other statistics may not have a standard asymptotic distribution, such as the OLS Wald type statistic as in Baltagi et al. [37]. The proposed two-stage LM test is based on the SEM model and all four possible results are summarized in Lauridsen and Kosfeld [17] (Table 1):
- Thus large values of reject the null hypothesis, which implies either or .
- The next step is to test if . This could be carried out by using the spatial differencing we introduced before. Under , , thus the first order difference on the regression, , yields i.i.d. error , which means the value of differenced LME (DLME) should be close to 0 under . But if , represents overdifferencing, i.e., , so spatial correlation in the error term still exists, and we cannot reject .
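The two stages can be sketched in code. The sketch assumes the textbook LM error statistic, (e'We / s²)² / tr(W'W + W²) with OLS residuals e and s² = e'e/n, which may differ in detail from the statistic in (107); the spatial difference is taken as (I - W) applied to both sides of the regression. The weight matrix, the data-generating values, and the function names are illustrative assumptions.

```python
import numpy as np

def ring_weights(n):
    """Row-normalized weights on a ring: each unit has its two adjacent units as neighbors."""
    w = np.zeros((n, n))
    idx = np.arange(n)
    w[idx, (idx - 1) % n] = 0.5
    w[idx, (idx + 1) % n] = 0.5
    return w

def lm_error(y, x, w):
    """Textbook LM statistic for spatial error dependence, computed from OLS residuals."""
    n = y.shape[0]
    beta_hat, *_ = np.linalg.lstsq(x, y, rcond=None)
    e = y - x @ beta_hat
    sigma2 = e @ e / n
    t = np.trace(w.T @ w + w @ w)
    return (e @ w @ e / sigma2) ** 2 / t

def differenced_lm_error(y, x, w):
    """Same statistic after spatially differencing the regression with (I - W)."""
    d = np.eye(y.shape[0]) - w
    # With a row-normalized W the differenced constant is a zero column;
    # lstsq returns the minimum-norm solution, so the residuals stay well defined.
    return lm_error(d @ y, d @ x, w)

rng = np.random.default_rng(3)
n = 400
w = ring_weights(n)
x = np.column_stack([np.ones(n), rng.standard_normal(n)])
beta = np.array([1.0, 2.0])
eps = rng.standard_normal(n)

y_iid = x @ beta + eps                                   # no spatial error dependence
u = np.linalg.solve(np.eye(n) - 0.9 * w, eps)            # strongly dependent errors
y_sem = x @ beta + u

lme_iid = lm_error(y_iid, x, w)
lme_sem = lm_error(y_sem, x, w)
dlme_sem = differenced_lm_error(y_sem, x, w)
print(lme_iid, lme_sem, dlme_sem)
```

Critical values and the decision rule that combines the two statistics are not reproduced here; they should be taken from Lauridsen and Kosfeld [17].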
Similar procedures could be used to investigate whether or any are spatially nonstationary, as the case in Lee and Yu [34]. Letting be one of , , , Lauridsen and Kosfeld [17] suggest using the regression
to obtain LME and DLME, respectively. is regressed on a constant term because there is no meaningful regressor but we still need the residuals.
Spatial cointegration could also be tested using this LM test. Thus, after determining and are nonstationary, regress on and on to obtain LME and DLME. The cointegration relation exists if LME is 0; and non-cointegration if LME is positive and DLME is 0; the limiting case of “near cointegration” occurs if LME and DLME are positive.
Moreover, Lauridsen and Kosfeld [18] generalize their two-stage LM test to account for unobserved heteroskedasticity. They specify the covariance matrix of , , have the diagonal element , where is vector of observations of exogenous variables for region i, related to via the vector of parameters . So the statistic in (107) should be adjusted as in Anselin [74] (p. 9, Equation (29)):
with and Z as the matrix containing the Z vectors that cause heteroskedasticity.
However, the Lauridsen and Kosfeld test procedure is not without problems. Beenstock et al. [19] point out that this procedure is not suitable for testing spatial cointegration since the second stage is misspecified. To see it more clearly, regress . The LM procedure asserts that if is not spatially correlated, then and are spatially integrated; and if is spatially correlated, and are not spatially integrated because of overdifferencing. Nevertheless, regressing on is equivalent to regressing two white noise series, and ; because Y and X are both I(1). Hence, the corresponding residuals must be spatially uncorrelated as long as and are independent, regardless of whether is spatially correlated, nonstationary, or not.
5.2. A Wald Test for Spatial Nonstationarity
Lauridsen and Kosfeld [65] suggest a Wald post-test statistic. Based on MLE, under , the general form of the Wald test is , where is the inverse of information matrix. If we specify the null hypothesis as , then with and , we have , where is the diagonal element of V corresponding to . However, as mentioned before, Wald statistics may not have a standard distribution so simulations are conducted. Unlike Fingleton [14], to generate SAR series with spatial unit roots, Lauridsen and Kosfeld [65] do not introduce the noncircular matrix. Thus is a singular point of so the inverse does not exist. To solve this issue, they use the Moore–Penrose pseudoinverse.
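The singularity at the unit root, and the pseudoinverse used to work around it, can be seen in a small example. The sketch assumes a row-normalized ring-shaped weight matrix with an odd number of units, chosen only for illustration; it does not reproduce how the pseudoinverse enters the Wald statistic of Lauridsen and Kosfeld [65].

```python
import numpy as np

def ring_weights(n):
    """Row-normalized weights on a ring: each unit has its two adjacent units as neighbors."""
    w = np.zeros((n, n))
    idx = np.arange(n)
    w[idx, (idx - 1) % n] = 0.5
    w[idx, (idx + 1) % n] = 0.5
    return w

n = 11
w = ring_weights(n)
a = np.eye(n) - 1.0 * w            # I - rho W evaluated at rho = 1

rank = np.linalg.matrix_rank(a)    # rows of W sum to one, so a @ ones = 0
a_pinv = np.linalg.pinv(a)         # Moore-Penrose pseudoinverse via the SVD

print(rank, np.allclose(a @ a_pinv @ a, a))
```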
According to Monte Carlo simulation, they find the critical limit of the Wald test under spatial nonstationarity is higher than the distribution, especially for the 5th and 10th percentile.
5.3. Test Unit Roots and Cointegration in the Sense of Spatial Impulses
Beenstock et al. [19] come up with an innovative method to test spatial unit roots and spatial cointegration by considering the behavior of the variance and the spatial impulse. Also, they do not assume unconnected spatial units or row normalize the spatial weight matrix either. Thus, based on the topology of the unit neighborhood, the spatial unit roots are in the regular lattices where n is the maximum and general number of neighbors of each unit. For example, , for bilateral space, for rook lattice and for queen lattice, with respectively. (“The weight matrix with first-order contiguity according to the rook criterion has the cells immediately above, below, to the right, and to the left, for a total of four neighboring cells. The weight matrix with first-order contiguity according to the queen criterion is eight cells immediately surrounding the central cell” [75] (p. 131). For the introduction of other types of the spatial weight matrices, see Kelejian and Robinson [26] (pp. 94–95).) And with spatial unit roots, is still well-defined because of the existence of the edge effect, that is, there exist some units having fewer neighbors than n. However, even though exists, the variance tends to explode even in finite sample space, which provides us with a way to determine the spatial unit roots for any arbitrary irregular lattices.
5.3.1. Spatial Impulse
The spatial impulse response is essentially the consequence of the shocks from one location to another. Intuitively, shocks should have no effect on the remote units if the spatial data are stationary. Beenstock et al. [19] first consider the simplest SAR model in lateral space:
where L denotes a spatial lag operator such that . The auxiliary equation is
When the discriminant of the above equation is greater than 0, , and there are two different solutions, , by Vieta’s formula. Hence, Beenstock et al. [19] express as
where (112) is known as the Wold representation that expresses in terms of the shocks. The impulse from location to i tends to 0 because . Also, varies with . When , , so the impulse does not die out with distance and explodes. This fact can also be seen from
If , is finite and independent of j; if , , and is infinite.
For the bilateral space case, are the spatial unit roots as shown before. Because of the edge effect, the singular point is strictly greater but approaches . This fact shows a downside of the row-normalized spatial weight matrix: it overstates the true weight of the unit at the edge of the lattice. For example, in a rook lattice, the units have three neighbors with weight at the edge, and four neighbors with weight in the center. The row-normalized procedure assigns a higher weight to the neighbors of edge units. This weight assignment is not necessarily reasonable and makes the spatial unit roots the same as the singular points. Moreover, without row-normalization, the edging units play the role of the unconnected unit as in Fingleton [14]. The general SAR model in bilateral space is and the Wold representation is , where . Let the spatial impulse response be defined as and . Analytical solutions of spatial impulse response in bilateral and higher dimension lattices are not obtained, but Beenstock et al. [19] expect to be positively related to the number of spatial units because of the larger spillover effect and varies inversely with the distance between i and j in the stationary case. If spatial unit roots exist, as in the lateral case, the impulse would not die out as the distance increases. This is supported by the simulation, though only the finite sample case could be simulated, see Beenstock and Felsenstein [76] (Figures 5.2 and 5.4). Compared with , when , it obviously shows a qualitative difference in the persistence of spatial impulses, as well as in the tendency for the explosion of the variance.
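The edge effect described above can be checked numerically. The sketch below uses unnormalized 0/1 adjacency matrices for units on a line and on a square rook lattice (illustrative assumptions), computes the smallest positive coefficient at which I - rho W becomes singular, and compares it with the reciprocal of the interior number of neighbors (2 and 4). It also compares how fast a unit shock decays with distance for a coefficient far from and close to that value.

```python
import numpy as np

def path_adjacency(n):
    """Unnormalized adjacency of n units on a line (two neighbors in the interior, one at the ends)."""
    a = np.zeros((n, n))
    idx = np.arange(n - 1)
    a[idx, idx + 1] = 1.0
    a[idx + 1, idx] = 1.0
    return a

def rook_adjacency(m):
    """Unnormalized rook contiguity on an m x m lattice, built as a Kronecker sum of two lines."""
    p = path_adjacency(m)
    eye = np.eye(m)
    return np.kron(p, eye) + np.kron(eye, p)

def singular_point(a):
    """Smallest positive rho with I - rho*A singular: the reciprocal of the largest eigenvalue."""
    return 1.0 / np.max(np.linalg.eigvalsh(a))

def impulse_decay(a, rho, source, distance):
    """Response `distance` units away from a unit shock, relative to the response at the source."""
    n = a.shape[0]
    shock = np.zeros(n)
    shock[source] = 1.0
    response = np.linalg.solve(np.eye(n) - rho * a, shock)
    return response[source + distance] / response[source]

sizes = [5, 10, 20]
line_points = [singular_point(path_adjacency(m)) for m in sizes]   # compare with 1/2
rook_points = [singular_point(rook_adjacency(m)) for m in sizes]   # compare with 1/4

line = path_adjacency(41)
decay_far = impulse_decay(line, 0.30, source=20, distance=10)   # well inside the stable region
decay_near = impulse_decay(line, 0.49, source=20, distance=10)  # close to 1/2

print(line_points, rook_points, decay_far, decay_near)
```

Because edge units have fewer neighbors, the largest eigenvalue stays below the interior neighbor count for every finite lattice, so the singular point lies above its reciprocal and moves toward it as the lattice grows.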
In the irregular lattices, the number of neighbors for the unit is undetermined generally, so the spatial unit roots, , cannot be calculated as the reciprocal of n. However, since the nonstationarity implies that spatial impulses do not disappear, one can find the empirical spatial unit roots by simulation.
The simulation method in Beenstock et al. [19] to calculate the critical value is pretty flexible and can be adapted to different models. For example, when both dynamic and spatial terms are included, Beenstock and Felsenstein [23] develop a similar procedure for testing cointegration in nonstationary panel data when estimating the spatial spillover effect in housing construction for Israel.
5.3.2. Spatial Unit Roots and Cointegration Tests
Knowing the spatial unit roots , Beenstock et al. [19] conduct Monte Carlo simulations to generate the artificial SAR series and use the MLE method to estimate SAC to obtain the corresponding distributions under different topologies (different sample size, criteria, etc.). Results show that the empirical distribution of SAC could be used to construct interval estimation and critical values for statistics that test spatial unit roots. For the spatial cointegration test, a similar procedure applies, but OLS estimation is used.
5.4. Some Applications
Kosfeld and Lauridsen [77] offer an application of the two-stage LM test in Section 5.1 to the income and productivity convergence in the German regional labor market. They find highly significant LME and DLME statistics (refer to formulas above like (107)) for all variables, which means the spatial unit roots are rejected. Yesilyurt and Elhorst [20] estimate the spatial interaction effects of inflation in Turkey. Because the regional inflation rates have a high tendency to co-move over time, they question whether the inflation rates of different regions are stationary in space. Using the two-stage LM statistics from (108), they find that the inflation curve is stationary in space. Olejnik [21] studies the income process of the extended European 25 based on the augmented Solow model taking into account the spatial autocorrelation effect. The stationarity of the error term as well as all variables in the model are investigated. No problem of spurious regression is found. Machado et al. [22] examine the spatial correlation of traffic accidents of vulnerable road users (such as pedestrians and cyclists) in big cities and detect the factors that contribute to these accidents. Because their study covers several cities, the model specifications may vary across different locations. Thus they use the two-stage LM statistic to choose the best model, see Machado et al. [22] (Table 4). Though the Wald post-test in Lauridsen and Kosfeld [65] is asymptotically equivalent to the LM test, “It is generally recommended to choose among these alternatives on the basis of computational ease [78] (p. 94)”.
6. Related Topics
Spatial panel data have been studied extensively. The spatial dependence is incorporated in the error component [12,75,79] or by spatial lag dependence [11,80]. See Baltagi [2] for a textbook discussion. Also, the panel data model can have time lagged dependent variables. If the panel data model includes both spatial and dynamic features, it is named as spatial dynamic panel data (SDPD) model by Yu et al. [10]. Yu et al. [10], Yu and Lee [16] and Yu et al. [36] study the QMLE estimator of the SDPD model under stable, unit roots, and spatial cointegration respectively. The concept of unit roots under the SDPD model is a combination of the spatial and dynamic one. To see this more clearly, Yu et al. [10] specify the model as
where and are column vectors. Since , assuming is invertible, the reduced form is
where . If the infinite sums are well-defined, then by continuous substitution
So instead of focusing on the singular points of , should be considered, which contains , and : the parameter of contemporaneous spatial effect, time lagged variable and time–spatial effect. A similar process as in Section 3.3.1, letting be the eigenvalue matrix of , Yu et al. [10] show that the eigenvalue matrix of is , which can be decomposed as . The power matrix of follows as with since the eigenvector matrix is orthogonal and . Thus, whether is stable or not depends on the value of compared with 1. Consequently, the decomposition of , which is a generalization of (40), can be expressed as
where is a possible stable part, is a possible unstable part, and is the time effect part, see Lee and Yu [81] for details. A data transformation procedure is imposed by them to eliminate both the time effects and the possible unstable term. Based on their analysis, the eigenvalues of , the asymptotic properties of QMLE and bias are derived. When eigenvalues of are all less than 1 (), or some equal to 1 ( and ), or all equal to 1 ( and ), the information matrix has different properties, see Yu and Lee [16] (Table 5).
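As a rough numerical illustration of the three cases above, the following Python sketch assumes that the autoregressive matrix takes the form $A_n = (I_n - \lambda W_n)^{-1}(\gamma I_n + \rho W_n)$ used in the SDPD literature, and it uses a symmetric, row-normalized ring matrix for $W_n$ (an illustrative choice, not one taken from the cited papers). The function names and parameter values are hypothetical. The sketch labels each parameter set by the moduli of the eigenvalues of $A_n$ and checks that, for a symmetric $W_n$, the power $A_n^h$ splits into a part driven by the unit eigenvalue of $W_n$ and a remainder.

```python
import numpy as np


def ring_weights(n):
    """Symmetric, row-normalized ring: each unit has two neighbours weighted 1/2."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, (i - 1) % n] = 0.5
        W[i, (i + 1) % n] = 0.5
    return W


def a_matrix(W, lam, gam, rho):
    """A_n = (I - lam*W)^{-1} (gam*I + rho*W), assuming I - lam*W is invertible."""
    I = np.eye(W.shape[0])
    return np.linalg.solve(I - lam * W, gam * I + rho * W)


def classify(W, lam, gam, rho, tol=1e-8):
    """Label the regime from the moduli of the eigenvalues of A_n."""
    mod = np.abs(np.linalg.eigvals(a_matrix(W, lam, gam, rho)))
    if np.any(mod > 1.0 + tol):
        return "explosive"
    n_unit = int(np.sum(np.abs(mod - 1.0) <= tol))
    if n_unit == 0:
        return "stable"
    return "unit root" if n_unit == mod.size else "spatial cointegration"


def power_via_decomposition(W, lam, gam, rho, h, tol=1e-8):
    """Rebuild A_n^h (h >= 1) as c^h * W_u + B^h; valid for symmetric W only."""
    w, G = np.linalg.eigh(W)                # W = G diag(w) G', G orthogonal
    d = (gam + rho * w) / (1.0 - lam * w)   # eigenvalues of A_n
    unit = np.abs(w - 1.0) <= tol           # positions of unit eigenvalues of W
    c = (gam + rho) / (1.0 - lam)           # eigenvalue of A_n attached to them
    W_u = G[:, unit] @ G[:, unit].T
    B = (G[:, ~unit] * d[~unit]) @ G[:, ~unit].T
    return c**h * W_u + np.linalg.matrix_power(B, h)


W = ring_weights(5)

# Illustrative (lam, gam, rho) triples, one per case discussed in the text.
cases = {
    "sum < 1": (0.2, 0.3, 0.1),
    "sum = 1, gam != 1": (0.3, 0.4, 0.3),
    "sum = 1, gam = 1": (0.3, 1.0, -0.3),
}
for name, (lam, gam, rho) in cases.items():
    print(name, "->", classify(W, lam, gam, rho))

# Compare the decomposition with a direct matrix power for h = 3.
lam, gam, rho = cases["sum = 1, gam != 1"]
direct = np.linalg.matrix_power(a_matrix(W, lam, gam, rho), 3)
print(np.allclose(direct, power_via_decomposition(W, lam, gam, rho, 3)))
```

The ring matrix is chosen only because it is both row-normalized and symmetric, so that the eigenvector matrix is orthogonal; for a general row-normalized $W_n$ the classification step still applies, but the decomposition helper would need the inverse of the eigenvector matrix instead of its transpose.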
Thus, the test for the unit eigenvalues of $A_n$ is of great importance. Most attention is paid to the unit roots in the time dimension, i.e., $\gamma_0 = 1$ and equivalently, $\rho_0 + \lambda_0 = 0$ if $\gamma_0 + \rho_0 + \lambda_0 = 1$. Unit root tests in panel data under spatial dependence have been extensively studied; see Baltagi [2] (Section 12.3) for a summary. The performance of different tests has also been considered in Baltagi et al. [24]. The test for $\gamma_0 + \rho_0 + \lambda_0 = 1$ has been investigated in Lee and Yu [81] (Section 14.3.4). Such a test works well when . However, when and T is small, it is not reliable. Thus, the unit root test for the SDPD model warrants further study.
Recently, another approach to describing strong spatial dependence has been proposed by Müller and Watson [49]. Since the spatial units are not neatly arranged, i.e., they form an irregular lattice, they do not model the spatial dependence by the SAR model but “posit a continuous parameter model of spatial variation” [49]. They use the Lévy–Brownian motion to define the spatial process, , which generalizes the Wiener process, widely used in time series, to d dimensions (when $d = 2$, it could be regarded as a random walk on the plane). The advantage of the Lévy–Brownian motion is that such a process is isotropic, which means the relative variance between two locations is determined by the distance but not the orientation (see Anselin [1] (p. 42)). A functional central limit theorem (FCLT) is established to characterize the asymptotic behavior of such a process. When one such process is regressed on another, independent one, spurious regression also occurs since the F statistics based on classical, HAC-corrected, and clustered standard errors diverge to infinity, which is similar to the finding in Fingleton [14]. To remedy this situation, “difference” regression is again considered. Unlike in time series, Müller and Watson [49] introduce isotropic differences that treat all directions symmetrically. That is, they regard the weighted values of the neighborhood as the “average” value of the current location, just as in the SAR model. The difference transformation is defined as , where is some weighting function. Their simulations show that regressions using isotropic differences do not suffer from spurious regression problems and that valid inference can be conducted. Some test procedures for $I(1)$ and $I(0)$ are also suggested.
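The isotropy property can be checked numerically. The sketch below assumes the standard normalization of Lévy–Brownian motion, $L(0) = 0$ and $\mathrm{Cov}(L(s), L(t)) = \tfrac{1}{2}(\lVert s \rVert + \lVert t \rVert - \lVert s - t \rVert)$, which may differ by a scale factor from the one used in Müller and Watson [49]. The function names, the number of locations, and the rotation angle are hypothetical. It verifies that the variance of an increment equals the distance between the two locations and that rotating all locations about the origin leaves the covariance unchanged.

```python
import numpy as np


def pairwise_dist(S):
    """Euclidean distances between all rows of S (n x d)."""
    return np.linalg.norm(S[:, None, :] - S[None, :, :], axis=2)


def levy_bm_cov(S):
    """Cov(L(s), L(t)) = (|s| + |t| - |s - t|) / 2, assuming L(0) = 0."""
    r = np.linalg.norm(S, axis=1)
    return 0.5 * (r[:, None] + r[None, :] - pairwise_dist(S))


def increment_var(C):
    """Var(L(s_i) - L(s_j)) implied by a covariance matrix C."""
    v = np.diag(C)
    return v[:, None] + v[None, :] - 2.0 * C


def draw_field(C, rng):
    """One Gaussian draw with covariance C, built from an eigendecomposition."""
    w, V = np.linalg.eigh(C)
    scale = np.sqrt(np.clip(w, 0.0, None))  # guard against tiny negative values
    return V @ (scale * rng.standard_normal(w.size))


rng = np.random.default_rng(0)
S = rng.uniform(size=(40, 2))   # irregularly placed locations in the unit square

theta = 0.7                     # arbitrary rotation angle in radians
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta), np.cos(theta)]])

C = levy_bm_cov(S)
C_rot = levy_bm_cov(S @ R.T)    # same locations after a rotation about the origin

# Increment variance depends on distance only.
print(np.allclose(increment_var(C), pairwise_dist(S)))
# Orientation does not matter.
print(np.allclose(C, C_rot))

y = draw_field(C, rng)          # one simulated spatial unit root field
```

The eigendecomposition is used for the draw instead of a Cholesky factor only to avoid failures when the covariance matrix is numerically close to singular, for example when two locations nearly coincide.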
7. Conclusions
This paper briefly surveys spatial unit roots in spatial models. First, some fundamental concepts in spatial econometrics are introduced. Spatial unit roots in SAR and SEM models may lead to spurious regression. For estimation and inference in the presence of spatial unit roots, QMLE and 2SLS methods are generally used and have satisfactory properties after scaling. The compactness assumption has recently been relaxed in spatial econometrics, which potentially makes spatial unit roots no longer a concern, but its implications for concepts like stationarity and spatial cointegration should be further investigated. The doubly geometric spatial autoregressive process has been widely used in some scientific fields, where the main concern is the regular lattice. Similar to time series, exact orders of convergence for different estimators are obtained because of its simple specification. But this limits its application in economics, where irregular lattices and different types of weight matrices are applied.
To detect possible spatial unit roots, as well as spatial cointegration, several test procedures have been proposed. Their applications are rather limited and depend heavily on simulations to obtain critical values. This could be explained by the fact that statistics under spatial scenarios generally do not have standard asymptotic distributions, let alone under an irregular lattice.
Lastly, some related topics were introduced. The idea of singular points is generalized in the SDPD model because such a model adds a time-lagged variable to the traditional SAR model. However, the existing literature focuses on the temporal unit roots in the SDPD model. Recently, an innovative way to study spatial unit roots describes the underlying spatial process using Lévy–Brownian motion, which generalizes its time series counterpart and serves as its spatial analogue. The limitations of different approaches and directions for further research were also discussed.
Author Contributions
Conceptualization, B.H.B. and J.S.; methodology, B.H.B. and J.S.; formal analysis, B.H.B. and J.S.; writing—original draft preparation, B.H.B. and J.S.; writing—review and editing, B.H.B. and J.S. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Data Availability Statement
No new data were created or analyzed in this study. Data sharing is not applicable to this article.
Acknowledgments
We would like to thank the editors and four anonymous referees for their valuable comments and suggestions.
Conflicts of Interest
The authors declare no conflict of interest.
References
1. Anselin, L. Spatial Econometrics: Methods and Models; Kluwer Academic Publishers: Dordrecht, The Netherlands, 1988.
2. Baltagi, B.H. Econometric Analysis of Panel Data, 6th ed.; Springer: Cham, Switzerland, 2021.
3. Elhorst, J.P. Spatial Econometrics: From Cross-Sectional Data to Spatial Panels; Springer: Heidelberg, Germany, 2014; Volume 479.
4. Ord, K. Estimation methods for models of spatial interaction. J. Am. Stat. Assoc. 1975, 70, 120–126.
5. Lee, L.F. Asymptotic distribution of quasi-maximum likelihood estimators for spatial autoregressive models. Econometrica 2004, 72, 1899–1925.
6. Kelejian, H.H.; Prucha, I.R. A generalized moments estimator for the autoregressive parameter in a spatial model. Int. Econ. Rev. 1999, 40, 509–533.
7. Kelejian, H.H.; Prucha, I.R. A generalized spatial two-stage least squares procedure for estimating a spatial autoregressive model with autoregressive disturbances. J. Real Estate Financ. Econ. 1998, 17, 99–121.
8. Lee, L.F. Best spatial two-stage least squares estimators for a spatial autoregressive model with autoregressive disturbances. Econom. Rev. 2003, 22, 307–335.
9. Lee, L.F. GMM and 2SLS estimation of mixed regressive, spatial autoregressive models. J. Econom. 2007, 137, 489–514.
10. Yu, J.; de Jong, R.; Lee, L.F. Quasi-maximum likelihood estimators for spatial dynamic panel data with fixed effects when both n and T are large. J. Econom. 2008, 146, 118–134.
11. Baltagi, B.H.; Liu, L. Instrumental variable estimation of a spatial autoregressive panel model with random effects. Econ. Lett. 2011, 111, 135–137.
12. Kapoor, M.; Kelejian, H.H.; Prucha, I.R. Panel data models with spatially correlated error components. J. Econom. 2007, 140, 97–130.
13. Keller, W.; Shiue, C.H. The origin of spatial interaction. J. Econom. 2007, 140, 304–332.
14. Fingleton, B. Spurious spatial regression: Some Monte Carlo results with a spatial unit root and spatial cointegration. J. Reg. Sci. 1999, 39, 1–19.
15. Lee, L.F.; Yu, J. Near unit root in the spatial autoregressive model. Spat. Econ. Anal. 2013, 8, 314–351.
16. Yu, J.; Lee, L.F. Estimation of unit root spatial dynamic panel data models. Econom. Theory 2010, 26, 1332–1362.
17. Lauridsen, J.; Kosfeld, R. A test strategy for spurious spatial regression, spatial nonstationarity, and spatial cointegration. Pap. Reg. Sci. 2006, 85, 363–377.
18. Lauridsen, J.; Kosfeld, R. Spatial cointegration and heteroscedasticity. J. Geogr. Syst. 2007, 9, 253–265.
19. Beenstock, M.; Feldman, D.; Felsenstein, D. Testing for unit roots and cointegration in spatial cross-section data. Spat. Econ. Anal. 2012, 7, 203–222.
20. Yesilyurt, F.; Elhorst, J.P. A regional analysis of inflation dynamics in Turkey. Ann. Reg. Sci. 2014, 52, 1–17.
21. Olejnik, A. Using the spatial autoregressively distributed lag model in assessing the regional convergence of per-capita income in the EU25. Pap. Reg. Sci. 2008, 87, 371–385.
22. Machado, C.A.S.; Giannotti, M.A.; Chiaravalloti Neto, F.; Tripodi, A.; Persia, L.; Quintanilha, J.A. Characterization of black spot zones for vulnerable road users in São Paulo (Brazil) and Rome (Italy). ISPRS Int. J. Geo-Inf. 2015, 4, 858–882.
23. Beenstock, M.; Felsenstein, D. Estimating spatial spillover in housing construction with nonstationary panel data. J. Hous. Econ. 2015, 28, 42–58.
24. Baltagi, B.H.; Bresson, G.; Pirotte, A. Panel unit root tests and spatial dependence. J. Appl. Econom. 2007, 22, 339–360.
25. Horn, R.A.; Johnson, C.R. Matrix Analysis, 2nd ed.; Cambridge University Press: New York, NY, USA, 2012.
26. Kelejian, H.H.; Robinson, D.P. Spatial correlation: A suggested alternative to the autoregressive model. In New Directions in Spatial Econometrics; Anselin, L., Florax, R.J.G.M., Eds.; Springer: Berlin/Heidelberg, Germany, 1995; pp. 75–95.
27. Griffith, D.A. Advanced Spatial Statistics: Special Topics in the Exploration of Quantitative Spatial Data Series; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2012; Volume 12.
28. Hamilton, J.D. Time Series Analysis; Princeton University Press: Princeton, NJ, USA, 1994.
29. Kelejian, H.H.; Prucha, I.R. On the asymptotic distribution of the Moran I test statistic with applications. J. Econom. 2001, 104, 219–257.
30. Kelejian, H.H.; Prucha, I.R. Specification and estimation of spatial autoregressive models with autoregressive and heteroskedastic disturbances. J. Econom. 2010, 157, 53–67.
31. Lee, L.F.; Yang, C.; Yu, J. QML and efficient GMM estimation of spatial autoregressive models with dominant (popular) units. J. Bus. Econ. Stat. 2023, 41, 550–562.
32. Mur, J.; Trívez, F.J. Unit roots and deterministic trends in spatial econometric models. Int. Reg. Sci. Rev. 2003, 26, 289–312.
33. Engle, R.F.; Granger, C.W.J. Co-integration and error correction: Representation, estimation, and testing. Econometrica 1987, 55, 251–276.
34. Lee, L.F.; Yu, J. Spatial nonstationarity and spurious regression: The case with a row-normalized spatial weights matrix. Spat. Econ. Anal. 2009, 4, 301–327.
35. Lee, L.F. Consistency and efficiency of least squares estimation for mixed regressive, spatial autoregressive models. Econom. Theory 2002, 18, 252–277.
36. Yu, J.; de Jong, R.; Lee, L.F. Estimation for spatial dynamic panel data with fixed effects: The case of spatial cointegration. J. Econom. 2012, 167, 16–37.
37. Baltagi, B.H.; Kao, C.; Liu, L. The estimation and testing of a linear regression with near unit root in the spatial autoregressive error term. Spat. Econ. Anal. 2013, 8, 241–270.
38. Baltagi, B.H.; Liu, L. Spurious spatial regression with equal weights. Stat. Probab. Lett. 2010, 80, 1640–1642.
39. Kelejian, H.H.; Prucha, I.R. 2SLS and OLS in a spatial autoregressive model with equal spatial weights. Reg. Sci. Urban Econ. 2002, 32, 691–707.
40. Krämer, W.; Donninger, C. Spatial autocorrelation among errors and the relative efficiency of OLS in the linear regression model. J. Am. Stat. Assoc. 1987, 82, 577–579.
41. Tilke, C. The relative efficiency of OLS in the linear regression model with spatially autocorrelated errors. Stat. Pap. 1993, 34, 263–270.
42. Krämer, W.; Baltagi, B. A general condition for an optimal limiting efficiency of OLS in the general linear regression model. Econ. Lett. 1996, 50, 13–17.
43. Martellosio, F. Efficiency of the OLS estimator in the vicinity of a spatial unit root. Stat. Probab. Lett. 2011, 81, 1285–1291.
44. Baran, S.; Pap, G.; van Zuijlen, M.C. Asymptotic inference for unit roots in spatial triangular autoregression. Acta Appl. Math. 2007, 96, 17–42.
45. Basu, S.; Reinsel, G.C. A note on properties of spatial Yule-Walker estimators. J. Stat. Comput. Simul. 1992, 41, 243–255.
46. Basu, S.; Reinsel, G.C. Properties of the spatial unilateral first-order ARMA model. Adv. Appl. Probab. 1993, 25, 631–648.
47. Baran, S. On the variances of a spatial unit root model. Lith. Math. J. 2011, 51, 122–140.
48. Paulauskas, V. On unit roots for spatial autoregressive models. J. Multivar. Anal. 2007, 98, 209–226.
49. Müller, U.K.; Watson, M.W. Spatial Unit Roots; Princeton University: Princeton, NJ, USA, 2022.
50. Gupta, A. Estimation of spatial autoregressions with stochastic weight matrices. Econom. Theory 2019, 35, 417–463.
51. Liu, T.; Xu, X.; Lee, L.F. Consistency without compactness of the parameter space in spatial econometrics. Econ. Lett. 2022, 210, 110224.
52. Newey, W.K.; McFadden, D. Large sample estimation and hypothesis testing. In Handbook of Econometrics; Elsevier: Amsterdam, The Netherlands, 1994; Volume 4, Chapter 35; pp. 2111–2245.
53. Gupta, A. Efficient closed-form estimation of large spatial autoregressions. J. Econom. 2023, 232, 148–167.
54. Rossi, F.; Lieberman, O. Spatial autoregressions with an extended parameter space and similarity-based weights. J. Econom. 2023, 235, 1770–1798.
55. Liu, L. A note on 2SLS estimation of the mixed regressive spatial autoregressive model. Econ. Lett. 2015, 134, 49–52.
56. Bhattacharyya, B.; Richardson, G.; Franklin, L. Asymptotic inference for near unit roots in spatial autoregression. Ann. Stat. 1997, 25, 1709–1724.
57. Bhattacharyya, B.; Khalil, T.; Richardson, G. Gauss-Newton estimation of parameters for a spatial autoregression model. Stat. Probab. Lett. 1996, 28, 173–179.
58. Phillips, P.C. Time series regression with a unit root. Econometrica 1987, 55, 277–301.
59. Baran, S.; Pap, G. Parameter estimation in a spatial unilateral unit root autoregressive model. J. Multivar. Anal. 2012, 107, 282–305.
60. Baran, S.; Pap, G.; van Zuijlen, M.C. Asymptotic inference for a nearly unstable sequence of stationary spatial AR models. Stat. Probab. Lett. 2004, 69, 53–61.
61. Baran, S.; Pap, G. On the least squares estimator in a nearly unstable sequence of stationary spatial AR models. J. Multivar. Anal. 2009, 100, 686–698.
62. Roknossadati, S.; Zarepour, M. M-estimation for a spatial unilateral autoregressive model with infinite variance innovations. Econom. Theory 2010, 26, 1663–1682.
63. Roknossadati, S.; Zarepour, M. M-estimation for near unit roots in spatial autoregression with infinite variance. Statistics 2011, 45, 337–348.
64. Ahlgren, N.; Gerkman, L. Inference in unilateral spatial econometric models. Bull. Int. Stat. Inst. 2007, 56, 1–44.
65. Lauridsen, J.; Kosfeld, R. A Wald test for spatial nonstationarity. Estud. Econ. Apl. 2004, 22, 475–486.
66. Martellosio, F. Power properties of invariant tests for spatial autocorrelation in linear regression. Econom. Theory 2010, 26, 152–186.
67. King, M.L. A small sample property of the Cliff-Ord test for spatial correlation. J. R. Stat. Soc. Ser. B (Methodological) 1981, 43, 263–264.
68. Krämer, W. Finite sample power of Cliff–Ord-type tests for spatial disturbance correlation in linear regression. J. Stat. Plan. Inference 2005, 128, 489–496.
69. Martellosio, F. Testing for spatial autocorrelation: The regressors that make the power disappear. Econom. Rev. 2012, 31, 215–240.
70. Born, B.; Breitung, J. Simple regression-based tests for spatial dependence. Econom. J. 2011, 14, 330–342.
71. Baltagi, B.H.; Yang, Z. Heteroskedasticity and non-normality robust LM tests for spatial dependence. Reg. Sci. Urban Econ. 2013, 43, 725–739.
72. Baltagi, B.H.; Yang, Z. Standardized LM tests for spatial error dependence in linear or panel regressions. Econom. J. 2013, 16, 103–134.
73. Preinerstorfer, D. How to avoid the zero-power trap in testing for correlation. Econom. Theory 2021, 39, 1292–1324.
74. Anselin, L. Lagrange multiplier test diagnostics for spatial dependence and spatial heterogeneity. Geogr. Anal. 1988, 20, 1–17.
75. Baltagi, B.H.; Song, S.H.; Koh, W. Testing panel data regression models with spatial error correlation. J. Econom. 2003, 117, 123–150.
76. Beenstock, M.; Felsenstein, D. Unit root and cointegration tests in spatial cross-section data. In The Econometric Analysis of Non-Stationary Spatial Panel Data; Advances in Spatial Science; Springer: Berlin/Heidelberg, Germany, 2019; Chapter 5; pp. 97–127.
77. Kosfeld, R.; Lauridsen, J. Dynamic spatial modelling of regional convergence processes. Empir. Econ. 2004, 29, 705–722.
78. Vaona, A. Spatial autocorrelation and the sensitivity of RESET: A simulation study. J. Geogr. Syst. 2010, 12, 89–103.
79. Fingleton, B. A generalized method of moments estimator for a spatial model with moving average errors, with application to real estate prices. Empir. Econ. 2008, 34, 35–57.
80. Baltagi, B.H.; Liu, L. Testing for random effects and spatial lag dependence in panel data models. Stat. Probab. Lett. 2008, 78, 3304–3306.
81. Lee, L.F.; Yu, J. A unified transformation approach for the estimation of spatial dynamic panel data models: Stability, spatial cointegration and explosive roots. In Handbook on Empirical Economics and Finance; Ullah, A., Giles, D., Eds.; Taylor & Francis Group: New York, NY, USA, 2011; Chapter 13; pp. 397–434.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).