*Econometrics* **2016**, *4*(1), 6; https://doi.org/10.3390/econometrics4010006

Article

Functional-Coefficient Spatial Durbin Models with Nonparametric Spatial Weights: An Application to Economic Growth

^{1} Department of Economics and Finance, University of Guelph, Guelph, ON N1G2W1, Canada

^{2} Department of Economics, Süleyman Şah University, Istanbul 34956, Turkey

^{*} Author to whom correspondence should be addressed.

Academic Editor: Isabel Casas

Received: 6 November 2015 / Accepted: 19 January 2016 / Published: 3 February 2016

## Abstract


This paper considers a functional-coefficient spatial Durbin model with nonparametric spatial weights. Applying the series approximation method, we estimate the unknown functional coefficients and spatial weighting functions via a nonparametric two-stage least squares (or 2SLS) estimation method. To further improve estimation accuracy, we also construct a second-step estimator of the unknown functional coefficients by a local linear regression approach. Some Monte Carlo simulation results are reported to assess the finite sample performance of our proposed estimators. We then apply the proposed model to re-examine national economic growth by augmenting the conventional Solow economic growth convergence model with unknown spatial interactive structures of the national economy, as well as country-specific Solow parameters, where the spatial weighting functions and Solow parameters are allowed to be a function of geographical distance and the countries’ openness to trade, respectively.

Keywords: functional coefficients; local linear regression; nonparametric 2SLS estimator; series estimator; Solow economic growth convergence model; spatial Durbin model

JEL: C14; C21; O47

## 1. Introduction

Ever since the seminal work of [1], a significant amount of empirical work has studied the variation in economic growth rates across countries. In particular, over the past two decades economists have paid increasing attention to the impacts of economic interaction and spillover effects on regional and national economies; see, e.g., [2] for taxation and the global allocation of capital, [3] for cross-border foreign direct investment decisions, [4,5] for economic growth models with worldwide interactions, [6] for country interactions in discretionary fiscal policy and [7] for an overview of empirical studies of strategic interaction among governments over environmental standards and public expenditures. Meanwhile, the econometric theory of parametric spatial regression models has been developed to analyse spatial and economic externalities; for detailed surveys of parametric spatial econometric models, see [8,9,10,11]. This paper contributes to this literature by examining the impact of cross-country economic externalities on national growth through a Solow growth model augmented with economic externalities.

The role of spatial dependence in regional economic growth has received substantial attention in the empirical growth literature over the past decade; see, e.g., [12] for a survey on economic growth and space. It has been recognized that a nation’s per capita GDP growth rate is affected not only by its own values of determinants, such as savings, population growth rate and initial level of income, but also by its neighbouring nations’ per capita GDP growth rates and the values of these determinants. For example, Ertur and Koch [4] developed a theoretical growth model with spatial externality resulting from technological interdependence among economies and proposed a spatially-augmented Solow economic growth model yielding a conditional convergence equation with heterogeneous Solow parameters. Note that heterogeneous Solow parameters are also supported by similar studies with no spatial interactions; see, e.g., [13,14,15].

Fitting a spatial Durbin model (SDM) using data from 91 non-oil regions and countries for the period from 1960 to 1995, Ertur and Koch [4] found positive and significant spatial dependence across these economies together with predicted signs for all coefficients. However, their study suffers from two potential problems. First, the parametric SDM requires researchers to pre-determine the non-stochastic spatial dependence structure among economies before estimating the parameters appearing in the model, and misspecified spatial interactive relations can lead to inconsistent estimation and misleading inference. Second, the subsampling method may not be the best way of studying heterogeneous Solow parameters. In this paper, working on the same dataset used in [4], we therefore aim to re-examine the spatial spillover effects of economic growth, while estimating in a nonparametric way the true spatial dependence structure among economies and allowing the Solow parameters to vary with respect to the trade openness of an economy.

Specifically, we propose a functional-coefficient spatial Durbin model with nonparametric spatial weights and estimate the unknown spatial weights and coefficient curves via a series approximation approach by a nonparametric two-stage least squares method. Based on the first-step consistent estimator, we then construct a second-step estimator for the unknown functional coefficients, which is oracle efficient in the sense that the limit distribution of the second-step estimator is the same regardless of whether the spatial weights are known. Moreover, we give our inference on spatial dependence through average direct and indirect impact values with standard errors calculated from the bootstrap method.

The remainder of this paper is organized as follows. Section 2 introduces our proposed semiparametric spatial Durbin model. Section 3 presents our estimation methodology. Section 4 reports results from a small Monte Carlo simulation to examine the finite sample performance of our proposed estimators. Section 5 gives our empirical results. Section 6 concludes.

## 2. Model

In the empirical economic growth literature, DeLong and Summers [16], to the best of our knowledge, is the first study to investigate spatial correlation taking geographical distance into account. Using a sample of 61 countries for the period from 1960 to 1985, they find no significant spatial correlation in their sample. Moreno and Trehan [17], on the other hand, augment [1]’s model with a spatial interactive term and find highly significant spillover effects between geographical neighbours; they argue that using a border dummy variable instead of a spatial lag term neglects the influence of neighbouring countries that do not share a common border with the country of interest. Relevant literature includes [18,19,20]. Moreover, Ertur, le Gallo, and Baumont [21] provide strong evidence of spatial dependence in economic convergence processes among European regional economies. Using 155 European regions over the period 1988–2000, Basile [22] also finds some evidence of spatial spillovers across countries.

The question of how to measure spatial interactive relations between any pair of spatial units is answered by defining a neighbourhood set for each spatial unit according to some selected relevant variables. For example, Cliff and Ord [8] specify spatial weights for spatial unit $i$ as the ratio of the length of the common border between units $i$ and $j$ to the geographical distance between them, $g_{ij}=b_{ij}^{\beta}/d_{ij}^{\alpha}$, with some parameters $\alpha>0$ and $\beta>0$. A common approach in practice is to use only distance-based weights with a decay parameter $\alpha$; i.e., the spatial weight from unit $j$ on unit $i$ is defined as $g_{ij}=g(d_{ij},\alpha)$, where $g(\cdot)$ is a known function and $\alpha$ is a parameter to be estimated. Commonly used distance functions include the inverse power function $g_{ij}=1/d_{ij}^{\alpha}$ and the negative exponential function $g_{ij}=\exp(-\alpha d_{ij})$ (e.g., [23]) for some $\alpha>0$. Moreover, if a cut-off distance is not used, then a non-sparse spatial weight matrix is constructed. This implies that every region is a neighbour of every other region, but the spatial weights decay as the distance between two regions increases. For recent studies on spatial weights, see [24,25]. Moreover, the term “neighbours” can also refer to contiguities defined by economic distances (e.g., [26,27,28]) and social networks ([29]).
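As an illustration of the two distance-based specifications just described, the following Python sketch builds both weight matrices from a distance matrix. It is not part of the original study: the function name `distance_weights`, the three-unit example and the value $\alpha=2$ are hypothetical choices made only for this example.

```python
import numpy as np

def distance_weights(d, alpha, kind="inverse_power"):
    """Build a spatial weight matrix from a distance matrix d.

    kind="inverse_power" uses g_ij = 1 / d_ij**alpha;
    kind="neg_exp" uses g_ij = exp(-alpha * d_ij).
    The diagonal is set to zero so that no unit is its own neighbour.
    """
    d = np.asarray(d, dtype=float)
    off_diag = ~np.eye(d.shape[0], dtype=bool)
    w = np.zeros_like(d)
    if kind == "inverse_power":
        w[off_diag] = 1.0 / d[off_diag] ** alpha
    elif kind == "neg_exp":
        w[off_diag] = np.exp(-alpha * d[off_diag])
    else:
        raise ValueError(f"unknown kind: {kind}")
    return w

# Hypothetical example: three units on a line at positions 0, 1 and 3.
pos = np.array([0.0, 1.0, 3.0])
d = np.abs(pos[:, None] - pos[None, :])
w_pow = distance_weights(d, alpha=2.0)
w_exp = distance_weights(d, alpha=2.0, kind="neg_exp")
print(w_pow)
print(w_exp)
```

Because no cut-off distance is applied, every off-diagonal entry is strictly positive, which is the non-sparse case mentioned in the text.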

As the functional form $g(\cdot)$ is unknown in practice, to avoid misspecifying the spatial weighting function, one can alternatively estimate $g(\cdot)$ from the data via a nonparametric series estimation method. Compared to the parametric spatial modelling approach, the nonparametric approach enables researchers to impose less restrictive assumptions on the spatial weight function; see [27,30] for details. Alternatively, Ahrens and Bhattacharjee [31] proposed to estimate the unknown spatial weights via the LASSO estimation method when the unknown spatial weights matrix is sufficiently sparse.

Ertur and Koch [4] derive a theoretical Solow economic growth model augmented with global technological interdependence. They then approximate their theoretical model by a parametric spatial Durbin model via a linearization procedure and calculate the spatial weights from both the inverse power function and the exponential function of geographic distances with $\alpha=2$ for the sake of robustness, as the true spatial weighting function is unknown. As the linearization and the selected parametric spatial weights may result in a model misspecification problem, in order to better approximate [4]’s theoretical model, we propose a semiparametric growth model that extends [4]’s parametric model by allowing nonparametric spatial weights, as well as varying Solow coefficients. Specifically, our proposed functional-coefficient spatial Durbin model with nonparametric spatial weights is given by:

$$Y_i=\sum_{j\ne i}g(Z_{ij})Y_j+\sum_{t=1}^{p}\sum_{j\ne i}m_t(Z_{ij})X_{tj}+X_i^{T}\theta(D_i)+u_i,\quad i=1,\dots,n, \tag{1}$$

where $Y_i$ is a scalar dependent variable, $X_i=[X_{1i},\dots,X_{pi}]^{T}$ is a $p\times 1$ vector, $D_i$ is a continuous scalar random variable and $Z$ is a non-stochastic spatial covariate with $Z_{ii}=0$ and $Z_{ij}>0$ for $i\ne j$. Moreover, $g(\cdot)$, $m_t(\cdot)$ for $t=1,\dots,p$ and $\theta(\cdot)$ are all unknown measurable smooth functions with $g(0)=0$ and $m_t(0)=0$, $t=1,\dots,p$. In Model (1), $X_i$ has a functional coefficient depending on $D_i$, and the unknown spatial weights are a function of non-stochastic geographic distance $Z_{ij}$. The first two terms on the right-hand side of Equation (1), $\sum_{j\ne i}g(Z_{ij})Y_j$ and $\sum_{j\ne i}m_t(Z_{ij})X_{tj}$ for each $t=1,\dots,p$, are called the spatial lag of the dependent variable and the spatially-lagged exogenous variables, respectively. In the spatial econometrics literature, this kind of model specification is referred to as the spatial Durbin model, where $(X_i,D_i,u_i)$ are independently distributed across $i$ while only the dependent variable $Y_i$ is dependently distributed across spatial units. The detailed data information on $X_i$, $D_i$, $Y_i$ and $Z_{ij}$ is deferred to Section 5. If the model includes only the spatial lag of the dependent variable, it is called a pure spatial autoregressive (SAR) model. Basile et al. [32] introduced the spatial autoregressive semiparametric geoadditive models to account for spatial dependence, spatial unobserved heterogeneity and unknown functional curves of regressors simultaneously, where the spatial autoregression is represented by a pre-determined spatial lag term of the dependent variable.

Let $G_n$ and $M_{n,t}$ be $n\times n$ unknown spatial weight matrices with their respective $(i,j)$th elements being $g_{ij}=g(Z_{ij})$ and $m_{t,ij}=m_t(Z_{ij})$, for $t=1,\dots,p$, $i=1,\dots,n$, $j=1,\dots,n$. We then obtain a reduced form of Model (1) written in matrix form:

$$Y=(I_n-G_n)^{-1}\Big[\sum_{t=1}^{p}M_{n,t}X_t+\mathrm{mtk}\{X,\theta(D)\}+U\Big], \tag{2}$$

where $Y=[Y_1,\dots,Y_n]^{T}$, $X_t=[X_{t1},\dots,X_{tn}]^{T}$ and $U=[u_1,\dots,u_n]^{T}$ are all $n\times 1$ vectors and $\mathrm{mtk}\{X,\theta(D)\}$ is an $n\times 1$ vector with the $i$th element equal to $X_i^{T}\theta(D_i)$. Furthermore, $I_n$ is an $n\times n$ identity matrix.

Let $\lambda_i(A_n)$ be the $i$th eigenvalue of an $n\times n$ matrix $A_n$, $\rho(A_n)=\max_{1\le i\le n}|\lambda_i(A_n)|$, and let $\|A_n\|_{\infty}=\max_{1\le i\le n}\sum_{j=1}^{n}|a_{ij}|$ and $\|A_n\|_{1}=\max_{1\le j\le n}\sum_{i=1}^{n}|a_{ij}|$ be the respective row and column norms of $A_n$. Furthermore, $C>0$ is a finite positive number that takes different values at different appearances. Below, we impose some regularity conditions on Model (1).

**Assumption A1:** (i) $\{Y_i\}$ is generated from Model (1), and $\{(X_i,D_i)\}$ is independently distributed with finite second moments; (ii) $g(\cdot)$, $m_t(\cdot)$ and $\theta(\cdot)$ are all uniformly bounded up to their respective $p$th-order derivatives for some $p>2$; (iii) $\{u_i\}$ is an independent sequence with zero mean, $E[u_i\mid X_i=x,D_i=d]=0$ and $E[u_i^{2}\mid X_i=x,D_i=d]=\sigma_i^{2}(x,d)>0$ for all $i$ and $(x,d)\in R^{p}\times R$, and $\sup_{(x,d)\in R^{p}\times R}\max_{1\le i\le n}E[|u_i|^{2+\delta}\mid X_i=x,D_i=d]\le C<\infty$ for some $\delta>0$ and a positive constant $C$.

**Assumption A2:** (i) There exist a positive integer $N$ and a constant $c_G\in(0,1)$, such that for all $n>N$, $\rho(G_n)\le c_G$; (ii) $\|G_n\|_{j}\le C<\infty$, $\|M_{n,t}\|_{j}\le C<\infty$ for all $t$ and $\|(I_n-G_n)^{-1}\|_{j}\le C<\infty$ for $j=1$ and $\infty$ and some finite value $C>0$.

Assumption A1 (i) states that the explanatory variables $(X_i,D_i)$ are independent, while the dependent variable $Y_i$ exhibits spatial dependence; Assumption A1 (iii) allows the error term, $u_i$, to be independent with heteroskedasticity, and the bounded higher order moment is required for deriving the limiting normal distribution of the proposed estimator. By [33] (p. 421), Assumption A2 (i) ensures that $I_n-G_n$ is a non-singular matrix with $(I_n-G_n)^{-1}=\sum_{j=0}^{\infty}G_n^{j}$, which implies that $\{Y_i\}$ is spatially stationary. In addition, we have $\max_{1\le i,j\le n}|g_{ij}|\le\rho(G_n)<1$ by Properties 4.66 and 4.67 in [33] (p. 68). It is straightforward to show that $n^{-1}Y^{T}Y=O_p(1)$, and that $E(Y_i^{2}\mid X_1=x_1,\dots,X_n=x_n,D_1=d_1,\dots,D_n=d_n)$ is continuously differentiable and uniformly bounded under Assumptions A1 and A2. In addition, Assumption A2 (ii) is a regularity condition (see, e.g., Assumption 1 in [34]), and it holds if the spatial weight function, $g(z)$, decreases to zero for large $z$ and:

$$\lim_{n\to\infty}n^{-1}\sum_{i=1}^{n}\sum_{j=1}^{n}I(Z_{ij}\in\mathcal{Z})<\infty\quad\text{for any fixed bounded set }\mathcal{Z}, \tag{3}$$

where the indicator function $I(\mathcal{A})=1$ if event $\mathcal{A}$ holds, and zero otherwise.
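The role of Assumption A2 can be checked numerically. The Python sketch below is an illustration only: the regular unit-spaced layout and the weight function $g(z)=0.3\exp(-z)$ are hypothetical choices, not taken from the paper. It computes the spectral radius and the row and column norms of $G_n$, and compares $(I_n-G_n)^{-1}$ with a truncated Neumann series $\sum_{j=0}^{J}G_n^{j}$.

```python
import numpy as np

# Hypothetical setup: n units on a regular unit-spaced line, with an assumed
# exponentially decaying weight function g(z) = 0.3 * exp(-z) and g(0) = 0.
n = 50
pos = np.arange(n, dtype=float)
Z = np.abs(pos[:, None] - pos[None, :])
G = 0.3 * np.exp(-Z)
np.fill_diagonal(G, 0.0)

# Spectral radius and the row/column norms that appear in Assumption A2.
rho = np.max(np.abs(np.linalg.eigvals(G)))
row_norm = np.max(np.abs(G).sum(axis=1))
col_norm = np.max(np.abs(G).sum(axis=0))

# Compare the exact inverse with a truncated Neumann series sum_{j=0}^{J} G^j.
inverse = np.linalg.inv(np.eye(n) - G)
neumann = np.eye(n)
term = np.eye(n)
for _ in range(100):
    term = term @ G
    neumann += term

print(f"rho = {rho:.3f}, row norm = {row_norm:.3f}, column norm = {col_norm:.3f}")
print("max abs gap between inverse and Neumann sum:", np.max(np.abs(inverse - neumann)))
```

In this example the row sums are bounded by $0.6/(e-1)\approx 0.35$, so the spectral radius is below one and the series converges quickly.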

Note that a parametric SDM is given by $Y_i=\rho\sum_{j\ne i}w_{ij}Y_j+\delta\sum_{t=1}^{p}\sum_{j\ne i}p_{t,ij}X_{tj}+X_i^{T}\theta_0+u_i$, $i=1,\dots,n$, where $W_n$ and $P_{n,t}$ are spatial weight matrices with their respective $(i,j)$th element equal to $w_{ij}$ and $p_{t,ij}$. Therefore, if the parametric spatial Durbin model holds true, the spatial weight matrices $G_n$ and $M_{n,t}$ in Model (1) are equivalent to $\rho W_n$ and $\delta P_{n,t}$ in the spatial Durbin model, respectively. From an estimation and econometric modelling viewpoint, the normalization of spatial weight matrices in the spatial Durbin model is used to identify the spatial multiplier parameters $(\rho,\delta)$, but this is not necessary in our proposed Model (1). Therefore, allowing nonparametric spatial weights saves us from applying an ad hoc spatial weight matrix normalization procedure as in the parametric SDM.

## 3. Estimation Methodology

If the dependent variable exhibits spatial autocorrelation, it must be accounted for by incorporating the spatially-lagged dependent variable into the model. If this variable is omitted, there is an omitted-variable type of specification error, because unobserved factors may have a direct effect on the response variable. Moreover, the presence of the spatially-lagged dependent variable in the model results in a simultaneity bias problem. This can be seen explicitly from the reduced form Model (2). From Model (2), we see that $E(G_nYU^{T})=G_n(I_n-G_n)^{-1}E(UU^{T})$ is a non-zero matrix, so that the spatial lag of the dependent variable, $G_nY$, is correlated with the error term, $U$, which results in an endogeneity problem in Model (1). Therefore, the ordinary least squares (or OLS) estimator would be biased and inconsistent. Moreover, the term $(I_n-G_n)^{-1}$ in Equation (2) shows that region $i$ is affected not only by its own determinants, but also by its neighbouring regions’ values. This has been called a global interaction effect in [4] (p. 1044) and [24] (p. 15). Another source of spatial endogeneity is the endogeneity of the spatial covariate in the model. Recently, Kelejian and Piras [35] and Sun [30] estimated the spatial panel data model and the SAR model with an endogenous spatial weight matrix in a nonparametric way, respectively. Moreover, Qu and Lee [36] proposed estimators for the parametric SAR model with an endogenous spatial covariate. This paper only deals with the endogeneity of the spatial lag of the dependent variable, as our spatial weight matrix is non-stochastic.
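To make the simultaneity argument concrete, the Python sketch below evaluates $G_n(I_n-G_n)^{-1}E(UU^{T})$ for a small example. It is an illustration under assumptions that are not in the paper: homoskedastic errors with $E(UU^{T})=\sigma^{2}I_n$, six units on a unit-spaced line and the hypothetical weight function $g(z)=0.3\exp(-z)$. Although $G_n$ has a zero diagonal, the diagonal of the product is strictly positive because $G_n^{2},G_n^{3},\dots$ feed each unit's own error back through its neighbours.

```python
import numpy as np

# Hypothetical example: six units on a unit-spaced line, g(z) = 0.3 * exp(-z).
n = 6
pos = np.arange(n, dtype=float)
Z = np.abs(pos[:, None] - pos[None, :])
G = 0.3 * np.exp(-Z)
np.fill_diagonal(G, 0.0)

# Assumption for this sketch only: E(U U^T) = sigma2 * I (homoskedastic errors).
sigma2 = 1.0
cov_lag_error = sigma2 * G @ np.linalg.inv(np.eye(n) - G)

# Entry (i, i) is Cov((G Y)_i, u_i): the spatial lag at unit i against its own error.
own_cov = np.diag(cov_lag_error)
print(own_cov)
```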

The endogeneity problem can be addressed by using the maximum likelihood estimation (or MLE) method, as well as the instrumental variable (or IV) approach. Ord [37] was the first to examine the MLE of SAR models. He proposed to use the eigenvalues of the spatial weights matrix to alleviate the computational complexity of the MLE method in large sample sizes. Lee [38] derived the large sample properties of the quasi-MLE without a normality assumption on error terms, while Bao and Ullah [39] obtained the second order bias of the maximum likelihood estimator for spatial autoregressive models. As the (quasi-) maximum likelihood estimator can be computationally difficult in moderate or large-sized samples, Kelejian and Prucha [40] proposed a two-stage least squares (or 2SLS) estimator for a SAR model with spatial autoregressive errors, while Lee [41] proposed an asymptotically-optimal 2SLS estimator. As the spatial weights in Model (1) are unknown, the 2SLS estimation methods derived in [40,41] are not feasible; we therefore use a series approximation method to recover the unknown spatial weight function and estimate all unknown functions via a nonparametric 2SLS (or NP2SLS) estimation method. For an overview of the sieve estimation method, see [42].

Specifically, we approximate the unknown weighting functions $g(\cdot)$ and $m_t(\cdot)$, $t=1,\dots,p$, and the vector of functional coefficients $\theta(\cdot)$ by series expansions:

$$g^{*}(z)=\sum_{l=1}^{L_n}\alpha_l\varphi_l(z),$$

$$m^{*}(z)=[m_1^{*}(z),\dots,m_p^{*}(z)]^{T}=\Big[\sum_{l=1}^{L_n}\gamma_{1l}\varphi_l(z),\sum_{l=1}^{L_n}\gamma_{2l}\varphi_l(z),\dots,\sum_{l=1}^{L_n}\gamma_{pl}\varphi_l(z)\Big]^{T},$$

and:

$$\theta^{*}(d)=[\theta_1^{*}(d),\dots,\theta_p^{*}(d)]^{T}=\Big[\sum_{l=1}^{L_n}\beta_{1l}\varphi_l(d),\sum_{l=1}^{L_n}\beta_{2l}\varphi_l(d),\dots,\sum_{l=1}^{L_n}\beta_{pl}\varphi_l(d)\Big]^{T},$$

respectively, where $\alpha=(\alpha_1,\alpha_2,\dots,\alpha_{L_n})^{T}$, $\gamma_t=(\gamma_{t1},\gamma_{t2},\dots,\gamma_{tL_n})^{T}$ and $\beta_t=(\beta_{t1},\beta_{t2},\dots,\beta_{tL_n})^{T}$ for $t=1,\dots,p$ are all $L_n\times 1$ vectors of unknown coefficients, $\{\varphi_j(\cdot)\}_{j=1}^{L_n}$ is a sequence of square integrable orthonormal basis functions over the interval $[0,\infty)$ and $L_n$ denotes the number of basis functions. The following assumption regulates the sparseness of the weight matrix and the smoothness of unknown functions.

**Assumption A3:** (i) There exists a positive constant sequence $\{v_n\}$, such that:

$$\max_{1\le l\le L_n,\,1\le i\le n}\sum_{j\ne i}^{n}|\varphi_l(Z_{ij})|\le Cv_n\quad\text{and}\quad\max_{1\le l\le L_n,\,1\le j\le n}\sum_{i=1}^{n}|\varphi_l(Z_{ij})|\le Cv_n.$$

(ii) The series approximation errors satisfy:

$$\max_{1\le i\le n}\sum_{j\ne i}|g(Z_{ij})-\alpha^{T}\Phi_{L_n}(Z_{ij})|=O(L_n^{-\zeta}),$$

$$\max_{1\le t\le p}\max_{1\le i\le n}\sum_{j\ne i}|m_t(Z_{ij})-\gamma_t^{T}\Phi_{L_n}(Z_{ij})|=O(L_n^{-\zeta}),$$

$$\sup_{d\in R}\max_{1\le t\le p}|\theta_t(d)-\beta_t^{T}\Phi_{L_n}(d)|=O(L_n^{-\zeta}).$$

It is not necessary to know the exact order of $v_n$ in Assumption A3 (i), and the consistency of our proposed estimator does not require $v_n\equiv 1$, as assumed in [27]’s Assumption (vi). From approximation theory, Assumption A1 (ii) is a necessary condition for Assumption A3 (ii). However, (8) and (9) also require spatial units to expand sparsely as more spatial units are included, for example when (3) holds true; and the consistency of our proposed estimator relies on increasing domain asymptotic theory. Moreover, we use the Laguerre polynomial series to approximate the unknown functions, as it is one of the common choices for series expansions when a function has a domain over $[0,\infty)$ ([42] (p. 5574)). In addition, $L_n$ acts as a smoothing parameter that increases slowly with the sample size; that is, we require $L_n\to\infty$ and $L_n/n\to 0$ as $n\to\infty$. An introduction to series estimation methods in a nonparametric framework can be found in [43] (Chapter 15).
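A minimal Python sketch of one possible Laguerre basis is given below. The paper does not spell out the exact form of $\varphi_l$, so the sketch assumes the standard Laguerre functions $\varphi_l(z)=e^{-z/2}L_{l-1}(z)$, where $L_k$ is the Laguerre polynomial of degree $k$; these are orthonormal on $[0,\infty)$. The helper name `laguerre_basis` is hypothetical. Orthonormality is checked with Gauss-Laguerre quadrature.

```python
import numpy as np
from numpy.polynomial.laguerre import lagvander, laggauss

def laguerre_basis(z, L):
    """Evaluate phi_l(z) = exp(-z/2) * L_{l-1}(z) for l = 1, ..., L.

    Assumed form of the basis (not stated in the paper). The input is treated
    as an array with at least one dimension; the output has shape z.shape + (L,).
    """
    z = np.atleast_1d(np.asarray(z, dtype=float))
    return np.exp(-z / 2.0)[..., None] * lagvander(z, L - 1)

# Check orthonormality: Gauss-Laguerre quadrature integrates exp(-x) * f(x)
# over [0, inf), so the integrand phi_l * phi_m is rescaled by exp(x).
L = 4
x, w = laggauss(10)
B = laguerre_basis(x, L)
gram = (B * (w * np.exp(x))[:, None]).T @ B
print(np.round(gram, 8))
```

The Gram matrix is numerically the identity, so the $L_n$ columns of the basis are orthonormal under this assumed form.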

Now, we approximate Model (1) by:

$$Y_i\approx\sum_{l=1}^{L_n}\alpha_l\sum_{j\ne i}\varphi_l(Z_{ij})Y_j+\sum_{t=1}^{p}\sum_{l=1}^{L_n}\gamma_{tl}\sum_{j\ne i}\varphi_l(Z_{ij})X_{tj}+\sum_{t=1}^{p}\sum_{l=1}^{L_n}\beta_{tl}\varphi_l(D_i)X_{ti}+u_i,\quad i=1,\dots,n. \tag{10}$$

To derive our first-step estimator, we rewrite Model (10) in matrix form as follows:

$$Y\approx V_n\xi+U, \tag{11}$$

where we denote the $i$th row vector of an $n\times[(2p+1)L_n]$ matrix $V_n$ by:

$$\begin{array}{l}V_{n,i}^{T}=\Big[\Big(\sum_{j\ne i}\Phi_{L_n}(Z_{ij})Y_j\Big)^{T},\Big(\sum_{j\ne i}\Phi_{L_n}(Z_{ij})X_{1j}\Big)^{T},\dots,\Big(\sum_{j\ne i}\Phi_{L_n}(Z_{ij})X_{pj}\Big)^{T},\\ \qquad\quad\big(\Phi_{L_n}(D_i)X_{1i}\big)^{T},\dots,\big(\Phi_{L_n}(D_i)X_{pi}\big)^{T}\Big]\end{array}$$

and a $[(2p+1)L_n]\times 1$ vector of parameters $\xi=[\alpha^{T},\gamma_1^{T},\gamma_2^{T},\dots,\gamma_p^{T},\beta_1^{T},\beta_2^{T},\dots,\beta_p^{T}]^{T}$.
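The construction of $V_n$ can be sketched in Python as follows. This is an illustration, not the authors' code: it reads $\Phi_{L_n}(\cdot)$ as the stacked vector $[\varphi_1(\cdot),\dots,\varphi_{L_n}(\cdot)]^{T}$, assumes the Laguerre-function form $\varphi_l(z)=e^{-z/2}L_{l-1}(z)$, and uses the hypothetical helper names `laguerre_basis` and `build_V`. The columns are ordered to match $\xi$: first the spatial lag of $Y$, then the spatial lags of $X_1,\dots,X_p$, then the functional-coefficient terms.

```python
import numpy as np
from numpy.polynomial.laguerre import lagvander

def laguerre_basis(z, L):
    """Assumed basis: phi_l(z) = exp(-z/2) * L_{l-1}(z), output z.shape + (L,)."""
    z = np.atleast_1d(np.asarray(z, dtype=float))
    return np.exp(-z / 2.0)[..., None] * lagvander(z, L - 1)

def build_V(Y, X, D, Z, L):
    """Return the n x (2p+1)L matrix V_n.

    Y: (n,) outcomes, X: (n, p) regressors, D: (n,) coefficient driver,
    Z: (n, n) distances. Terms with j == i are excluded from every spatial sum.
    """
    n, p = X.shape
    phi_z = laguerre_basis(Z, L)               # shape (n, n, L)
    idx = np.arange(n)
    phi_z[idx, idx] = 0.0                      # drop the j == i terms
    phi_d = laguerre_basis(D, L)               # shape (n, L)
    blocks = [np.einsum("ijl,j->il", phi_z, Y)]                      # alpha block
    blocks += [np.einsum("ijl,j->il", phi_z, X[:, t]) for t in range(p)]  # gamma_t blocks
    blocks += [phi_d * X[:, [t]] for t in range(p)]                  # beta_t blocks
    return np.hstack(blocks)

# Hypothetical small example with simulated inputs.
rng = np.random.default_rng(0)
n, p, L = 8, 2, 3
loc = rng.uniform(0.0, 5.0, size=n)
Z = np.abs(loc[:, None] - loc[None, :])
X = rng.normal(size=(n, p))
D = rng.uniform(size=n)
Y = rng.normal(size=n)
V = build_V(Y, X, D, Z, L)
print(V.shape)
```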

The specification of the instrumental variable matrix is of great importance to obtain a consistent estimator. Since the number of endogenous variables increases with the number of approximating functions, $L_n$, it is intuitively appealing to instrument the endogenous variables, $\sum_{j\ne i}\varphi_l(Z_{ij})Y_j$, $l=1,\dots,L_n$, by $\big(\sum_{j\ne i}\Phi_{L_n}(Z_{ij})X_{t_1j}\big)X_{t_2i}$ and $\sum_{j\ne i}\Phi_{L_n}(Z_{ij})D_j$, $t_1,t_2\in\{1,2,\dots,p\}$ for $p>1$ as in our empirical application; see, e.g., [44]. Since $X_{t_1j}X_{t_2i}$ and $D_j$ are exogenous and relevant in predicting $Y_j$, we would expect the proposed instrumental variables to serve as valid instruments for $\sum_{j\ne i}\Phi_{L_n}(Z_{ij})Y_j$. Therefore, we define the $i$th row vector of an $n\times[(2p+2)L_n]$ instrumental matrix $Q_n$ as:

$$\begin{array}{l}Q_{n,i}^{T}=\Big[\Big(\Big(\sum_{j\ne i}\Phi_{L_n}(Z_{ij})X_{1j}\Big)X_{1i}\Big)^{T},\Big(\sum_{j\ne i}\Phi_{L_n}(Z_{ij})X_{1j}\Big)^{T},\dots,\Big(\sum_{j\ne i}\Phi_{L_n}(Z_{ij})X_{pj}\Big)^{T},\\ \qquad\quad\big(\Phi_{L_n}(D_i)X_{1i}\big)^{T},\dots,\big(\Phi_{L_n}(D_i)X_{pi}\big)^{T},\Big(\sum_{j\ne i}\Phi_{L_n}(Z_{ij})D_j\Big)^{T}\Big].\end{array}$$

We can then estimate $\xi$ from (11) by the 2SLS estimation method. Note that we do not pursue optimal instrumental variables in this paper due to the complexity of this approach in the semiparametric setup and the fact that the oracle efficiency of the second-step estimator of $\theta(\cdot)$ does not rely on the use of optimal instruments in the first-step estimation.
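The instrument matrix displayed above can be assembled in the same way as the regressor matrix. The Python sketch below is an illustration under the same assumptions as before (Laguerre-function basis, hypothetical helper names `laguerre_basis` and `build_Q`). It follows the displayed definition literally, so only the interaction of the spatially-lagged $X_1$ with $X_{1i}$ is included as an extra block.

```python
import numpy as np
from numpy.polynomial.laguerre import lagvander

def laguerre_basis(z, L):
    """Assumed basis: phi_l(z) = exp(-z/2) * L_{l-1}(z), output z.shape + (L,)."""
    z = np.atleast_1d(np.asarray(z, dtype=float))
    return np.exp(-z / 2.0)[..., None] * lagvander(z, L - 1)

def build_Q(X, D, Z, L):
    """Return the n x (2p+2)L instrument matrix Q_n from the displayed definition."""
    n, p = X.shape
    phi_z = laguerre_basis(Z, L)               # shape (n, n, L)
    idx = np.arange(n)
    phi_z[idx, idx] = 0.0                      # drop the j == i terms
    phi_d = laguerre_basis(D, L)               # shape (n, L)
    lag_x = [np.einsum("ijl,j->il", phi_z, X[:, t]) for t in range(p)]
    blocks = [lag_x[0] * X[:, [0]]]            # (sum_j Phi(Z_ij) X_1j) * X_1i
    blocks += lag_x                            # spatial lags of X_1, ..., X_p
    blocks += [phi_d * X[:, [t]] for t in range(p)]   # Phi(D_i) X_ti
    blocks += [np.einsum("ijl,j->il", phi_z, D)]      # spatial lag of D
    return np.hstack(blocks)

# Hypothetical small example with simulated inputs.
rng = np.random.default_rng(1)
n, p, L = 10, 2, 3
loc = rng.uniform(0.0, 5.0, size=n)
Z = np.abs(loc[:, None] - loc[None, :])
X = rng.normal(size=(n, p))
D = rng.uniform(size=n)
Q = build_Q(X, D, Z, L)
print(Q.shape)
```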

To ensure the existence of our 2SLS estimator, we assume that the exogenous regressors matrix $X_n$, the instrumental variables matrix $Q_n$ and $V_n^{T}Q_n(Q_n^{T}Q_n)^{-1}Q_n^{T}V_n$ all have full column rank. Moreover, for the relevance of the instruments, we assume that $E[Q_n^{T}V_n]$ has full column rank. Otherwise, we can remove linearly-dependent terms as long as the number of instruments in $Q_n$ is more than the number of endogenous variables $L_n$ plus the number of exogenous regressors $2p$. Lee [45] (p. 493) argues that the 2SLS estimator would be inconsistent if $(X_i,D_i)$ are both irrelevant in predicting $\{Y_i\}$. Therefore, throughout this paper, we assume that $X$ and $D$ contain relevant variables in predicting $\{Y_i\}$ and $\theta(\cdot)$ takes non-zero values over any non-empty interval, so that there is no need to use the quadratic moments as additional orthogonal relations, as suggested in [45]. Our empirical application in this paper satisfies this assumption by both economic theory and empirical findings observed from the economic growth literature.

To construct a consistent estimator for $g(\cdot)$, $m_t(\cdot)$ and $\theta_t(\cdot)$, $t=1,\dots,p$, we consider the following nonparametric 2SLS objective function:

$$\min_{\xi}\big[Q_n^{T}(Y-V_n\xi)\big]^{T}\big(Q_n^{T}Q_n\big)^{-1}\big[Q_n^{T}(Y-V_n\xi)\big]. \tag{12}$$

The nonparametric 2SLS estimator of $\xi$ solves (12) and is given by:

$$\widehat{\xi}=\Big[V_n^{T}Q_n\big(Q_n^{T}Q_n\big)^{-1}Q_n^{T}V_n\Big]^{-1}V_n^{T}Q_n\big(Q_n^{T}Q_n\big)^{-1}Q_n^{T}Y,$$

and hence, the corresponding nonparametric 2SLS estimators^{1} of the unknown functions are given by:

$$\widehat{g}(z)=\sum_{l=1}^{L_n}\widehat{\alpha}_l\varphi_l(z),$$

$$\widehat{m}(z)=[\widehat{m}_1(z),\dots,\widehat{m}_p(z)]^{T}=\Big[\sum_{l=1}^{L_n}\widehat{\gamma}_{1l}\varphi_l(z),\sum_{l=1}^{L_n}\widehat{\gamma}_{2l}\varphi_l(z),\dots,\sum_{l=1}^{L_n}\widehat{\gamma}_{pl}\varphi_l(z)\Big]^{T},$$

and

$$\widehat{\theta}(d)=[\widehat{\theta}_1(d),\dots,\widehat{\theta}_p(d)]^{T}=\Big[\sum_{l=1}^{L_n}\widehat{\beta}_{1l}\varphi_l(d),\sum_{l=1}^{L_n}\widehat{\beta}_{2l}\varphi_l(d),\dots,\sum_{l=1}^{L_n}\widehat{\beta}_{pl}\varphi_l(d)\Big]^{T}.$$
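The closed-form 2SLS expression for $\widehat{\xi}$ can be computed without forming $(Q_n^{T}Q_n)^{-1}$ explicitly. The Python sketch below is a generic illustration: the function name `two_sls` and the toy data-generating process with one endogenous regressor and three instruments are hypothetical and unrelated to the paper's growth data. It first projects the regressors onto the column space of the instruments and then solves the second-stage normal equations.

```python
import numpy as np

def two_sls(V, Q, Y):
    """Compute [V'P V]^{-1} V'P Y with P = Q (Q'Q)^{-1} Q'.

    V: (n, k) regressors, Q: (n, m) instruments with m >= k, Y: (n,) outcome.
    """
    # First stage: fitted regressors P V, obtained by least squares on Q.
    V_fit = Q @ np.linalg.lstsq(Q, V, rcond=None)[0]
    # Second stage: since P is symmetric and idempotent, V_fit'V = V'P V.
    return np.linalg.solve(V_fit.T @ V, V_fit.T @ Y)

# Toy example (hypothetical): v is endogenous because its error e also enters u.
rng = np.random.default_rng(0)
n = 500
Q = rng.normal(size=(n, 3))
e = rng.normal(size=n)
v = Q @ np.array([1.0, -0.5, 0.25]) + e
u = 0.8 * e + rng.normal(size=n)
V = v[:, None]
Y = 2.0 * v + u

xi_hat = two_sls(V, Q, Y)
ols = np.linalg.lstsq(V, Y, rcond=None)[0]
print("2SLS:", xi_hat, "OLS:", ols)
```

In the paper's setting, `V` and `Q` would be the matrices $V_n$ and $Q_n$, and the blocks of the returned vector would be read off in the order $\alpha,\gamma_1,\dots,\gamma_p,\beta_1,\dots,\beta_p$.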

Next, we propose a second-step estimator for the functional coefficients, $\theta(d)$, using the local linear regression approach. We expect the local linear estimate of $\theta(d)$, $\tilde{\theta}(d)$, to improve on the first-step estimator, $\widehat{\theta}(d)$. Sun [30] considered a semiparametric spatial autoregressive model that corresponds to Model (1) with $M_{n,t}=0$ for all $t=1,\dots,p$ and showed that the local linear estimator of $\theta(\cdot)$ can be oracle efficient under some regularity conditions, in the sense that its limiting distribution does not depend on whether or not the spatial weights are known. This is a general result for non-/semi-parametric additive models. As the unknown functions $\big(g(Z_{ij}),m_1(Z_{ij}),\dots,m_p(Z_{ij})\big)$ and $\theta(D_i)$ enter Model (1) additively, we expect that $\tilde{\theta}(d)$ is oracle efficient as well. We assume that as $n\to\infty$, $h\to 0$, $nh\to\infty$ and $nh^{5}\to c\in(0,\infty)$, where $h$ is the bandwidth, which controls the size of the local neighbourhood around an interior point $d$. Moreover, let $K(\cdot)$ be a kernel function, which assigns more weight to the data closer to point $d$, satisfying: (i) $\int K(a)\,da=1$; (ii) $K(a)=K(-a)$; and (iii) $\int a^{2}K(a)\,da>0$.

The estimation procedure for $\tilde{\theta}\left(d\right)$ is given as follows:

(i) We replace $g\left(z\right)$ and ${m}_{t}\left(z\right)$ in (1) by $\widehat{g}\left(z\right)$ and ${\widehat{m}}_{t}\left(z\right)$, respectively, and treat ${\widehat{Y}}_{i}={Y}_{i}-{\sum}_{j\ne i}\widehat{g}\left({Z}_{ij}\right){Y}_{j}-{\sum}_{t=1}^{p}{\sum}_{j\ne i}{\widehat{m}}_{t}\left({Z}_{ij}\right){X}_{tj}$ as the dependent variable.

(ii) Applying the first-order Taylor series expansion of $\theta(D)$ around $d$, $\theta(D)\approx\theta(d)+\theta^{\prime}(d)(D-d)$, we calculate the local linear estimator from a minimization of a kernel-weighted objective function:

$$\big(\tilde{\theta}(d),\tilde{\theta}^{\prime}(d)\big)=\arg\min_{\theta(d),\theta^{\prime}(d)}\sum_{i=1}^{n}\big[\widehat{Y}_i-X_i^{T}\theta(d)-X_i^{T}\theta^{\prime}(d)(D_i-d)\big]^{2}K\big((D_i-d)/h\big),$$

where $\tilde{\theta}(d)$ estimates $\theta(d)$ and $\tilde{\theta}^{\prime}(d)$ estimates $\theta^{\prime}(d)$, the first order derivative of $\theta(d)$.

For a complete treatment of local linear estimator, see [48]. As the mathematical proofs of the consistency of the first-step estimator of ${\left(\widehat{g}\left(z\right),{\widehat{m}}_{1}\left(z\right),\dots ,{\widehat{m}}_{p}\left(z\right),\widehat{\theta}{\left(d\right)}^{T}\right)}^{T}$ and the limiting result of the second-step estimator $\tilde{\theta}\left(d\right)$ closely follow those given in [30], the proofs are omitted from the paper.
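
As an illustration of step (ii), the kernel-weighted minimization at a single point $d$ reduces to a weighted least squares fit on the regressors $[X_i, X_i(D_i-d)]$. The following is a minimal sketch, not the authors' implementation: the function name `local_linear_fc`, its argument names and the choice of a Gaussian kernel are assumptions made for illustration, and `y_hat` stands for the adjusted dependent variable $\widehat{Y}_i$ from step (i).

```python
import numpy as np

def local_linear_fc(y_hat, X, D, d, h):
    """Local linear estimate of theta(d) and theta'(d) at one point d.

    y_hat : (n,) adjusted dependent variable from step (i)
    X     : (n, k) regressors carrying functional coefficients
    D     : (n,) smoothing variable
    d, h  : evaluation point and bandwidth
    """
    u = (D - d) / h
    # Gaussian kernel weights K((D_i - d) / h); the kernel choice is an assumption
    w = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    # Regressors for the level theta(d) and the slope theta'(d)
    Z = np.hstack([X, X * (D - d)[:, None]])
    # Weighted least squares via square-root weights
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(Z * sw[:, None], y_hat * sw, rcond=None)
    k = X.shape[1]
    return coef[:k], coef[k:]
```

Evaluating the function over a grid of $d$ values traces out the estimated coefficient curve; the bandwidth `h` would in practice be chosen by a data-driven rule such as cross-validation.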

## 4. Monte Carlo Simulations

In this section, we present the results from a small Monte Carlo simulation study to assess the finite-sample properties of our estimators; more simulation results can be obtained from the authors upon request. We generate the data from the following regression model:

$${Y}_{i}=\sum _{j\ne i}g\left({Z}_{ij}\right){Y}_{j}+\sum _{j\ne i}m\left({Z}_{ij}\right){X}_{j}+{X}_{i}\exp \left(-{\left(4{D}_{i}-1\right)}^{2}\right)+{u}_{i},\quad i=1,2,\dots ,n,$$

where we randomly draw ${u}_{i}\sim $ i.i.d. $N(0,0.5)$, ${D}_{i}\sim $ i.i.d. $U[0,1]$ and ${X}_{i}=0.5{D}_{i}+{\eta}_{i}$ with ${\eta}_{i}\sim $ i.i.d. $N(0,1)$ independent of $\left\{{u}_{i}\right\}$. For the exogenous variable, $Z$, we first randomly generate $n$ observations from the $U[0,{R}_{n}]$ distribution with ${R}_{n}=0.001{n}^{1.6}$, by which we control the sparseness of spatial units. Then, we calculate ${Z}_{ij}$ as the absolute distance between observations $i$ and $j$. The specification of spatial weight functions requires that $g(\cdot)$ and $m(\cdot)$ are both decreasing and non-negative functions. We therefore set $g\left(z\right)=m\left(z\right)=0.01\exp (-z/0.01)$ for $z>0$ with $g\left(0\right)=0$ and $m\left(0\right)=0$. The random variables, ${u}_{i}$, ${D}_{i}$ and ${Z}_{i}$, are all mutually independent.
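
The data-generating process can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the code used for Table 1: the function name `simulate` and the returned dictionary are hypothetical, $N(0,0.5)$ is read as a variance of 0.5, and $Y$ is obtained by solving the linear system $(I_n-G_n)Y = M_nX+\theta(D)\odot X+U$.

```python
import numpy as np

def simulate(n, rng):
    """Draw one sample from the spatial Durbin design described in the text."""
    R_n = 0.001 * n**1.6                      # controls sparseness of spatial units
    s = rng.uniform(0.0, R_n, n)              # locations on a line
    Z = np.abs(s[:, None] - s[None, :])       # Z_ij = absolute distance
    W = 0.01 * np.exp(-Z / 0.01)              # g(z) = m(z) for z > 0
    np.fill_diagonal(W, 0.0)                  # g(0) = m(0) = 0
    G, M = W, W.copy()
    D = rng.uniform(0.0, 1.0, n)
    X = 0.5 * D + rng.normal(size=n)
    u = rng.normal(0.0, np.sqrt(0.5), n)      # assumption: 0.5 is the variance
    theta = np.exp(-(4.0 * D - 1.0) ** 2)     # functional coefficient theta(D)
    # Solve (I - G) Y = M X + theta * X + u for Y
    Y = np.linalg.solve(np.eye(n) - G, M @ X + theta * X + u)
    return {"Y": Y, "X": X, "D": D, "Z": Z, "G": G, "M": M,
            "theta": theta, "u": u}
```

Calling `simulate(100, np.random.default_rng(0))` returns one replication; repeating the call with fresh draws gives the Monte Carlo samples.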

We consider sample sizes $n\in \{100,200,400\}$. The number of replications is 1000 for each $n$ in the Monte Carlo experiments. Moreover, we set ${L}_{n}=1,2,3$ for each sample size, respectively. In the second-step estimation of coefficient functions, we select the bandwidth via a cross-validation method and use the Gaussian kernel function. To measure the performance of the estimators, we compute the root mean squared errors (or RMSEs) for each simulation. In Table 1, we report the averages of the RMSEs computed over 1000 repetitions, where $\widehat{g}(\cdot)$, $\widehat{m}(\cdot)$ and $\widehat{\theta}(\cdot)$ denote the NP2SLS estimators of $g(\cdot)$, $m(\cdot)$ and $\theta (\cdot)$, respectively; $\tilde{\theta}(\cdot)$ is the second-step estimator of $\theta (\cdot)$, and $\overrightarrow{\theta}(\cdot)$ is the local linear estimator of $\theta (\cdot)$ when $g(\cdot)$ and $m(\cdot)$ are known. Furthermore, we also estimate the average direct impact (ADI) and the average indirect impact (AII) and report their corresponding RMSEs in Table 1. Specifically, we first obtain the reduced-form model from Equation (17):

$$Y={({I}_{n}-{G}_{n})}^{-1}\left[{M}_{n}X+\theta \left(D\right)\odot X+U\right],$$

where $Y$, $X$, $\theta \left(D\right)$ and $U$ are all $n\times 1$ vectors, and “$\odot $” denotes the Hadamard multiplication. Then, the expected marginal effect of $X$ is given by the following $n\times n$ matrix:

$$\frac{\partial E\left(Y\mid X,D\right)}{\partial X}\equiv {({I}_{n}-{G}_{n})}^{-1}\left[{M}_{n}+diag\left\{\theta \left(D\right)\right\}\right]=S\left({G}_{n},{M}_{n},D\right),$$

from which we obtain ADI$={n}^{-1}tr\left\{S\left({G}_{n},{M}_{n},D\right)\right\}$ and AII$={n}^{-1}{\mathbf{i}}_{n}^{\prime}S\left({G}_{n},{M}_{n},D\right){\mathbf{i}}_{n}-$ADI; see LeSage and Pace [10], where ${\mathbf{i}}_{n}$ is the $n\times 1$ vector of ones and $diag\left\{\theta \left(D\right)\right\}$ is a diagonal matrix. Replacing the two unknown spatial weight matrices and $\theta \left(D\right)$ by their estimates, we obtain the estimates for ADI and AII.
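
The ADI and AII follow directly from the matrix $S(G_n,M_n,D)$. The sketch below assumes that the (estimated) weight matrices and the coefficient values $\theta(D_i)$ are already available as NumPy arrays; the function name `impacts` is hypothetical.

```python
import numpy as np

def impacts(G, M, theta):
    """Average direct and indirect impacts from S = (I - G)^{-1} [M + diag(theta)]."""
    n = G.shape[0]
    S = np.linalg.solve(np.eye(n) - G, M + np.diag(theta))
    adi = np.trace(S) / n          # average of the diagonal elements
    ati = S.sum() / n              # average total impact: i' S i / n
    return adi, ati - adi          # AII = average total impact minus ADI
```

For instance, with two units, $G$ having off-diagonal entries 0.5, $M=0$ and $\theta=(1,1)$, the function returns an ADI of 4/3 and an AII of 2/3.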

From Table 1, we observe that the RMSEs of all three estimators decrease as the sample size increases in each design. Moreover, the second-step estimator always performs better than the nonparametric 2SLS estimator. The ratio of the RMSE of the second-step estimator to that of $\overrightarrow{\theta}(\cdot)$ generally falls as the sample size increases. Therefore, our simulation results support the consistency of our proposed estimators. As for the ADI and AII, we also see an overall decreasing pattern in the RMSEs as the sample size increases, where the AII is less accurately estimated than the ADI, as the former is calculated from $n\left(n-1\right)$ terms and the latter is calculated from $n$ elements only.

| $n$ | $\widehat{g}(\cdot)$ | $\widehat{m}(\cdot)$ | $\widehat{\theta}(\cdot)$ | $\tilde{\theta}(\cdot)$ | $\overrightarrow{\theta}(\cdot)$ | ADI | AII |
|---|---|---|---|---|---|---|---|
| 100 | 0.2109 | 0.0734 | 0.3535 | 0.2159 | 0.1557 | 0.1061 | 0.4386 |
| 200 | 0.1901 | 0.0593 | 0.2579 | 0.1680 | 0.1168 | 0.0761 | 0.2913 |
| 400 | 0.0500 | 0.0160 | 0.2013 | 0.1001 | 0.0859 | 0.0432 | 0.2989 |

## 5. Empirical Application

The Monte Carlo simulation results given in Section 4 support the consistency of our proposed estimation method. We are now in a position to re-investigate cross-country growth patterns. We want to evaluate the impact of a country’s initial income, savings rate, population growth rate and openness, as well as neighbouring countries’ economic growth spillovers, on a country’s economic growth rate. We follow [4] in using a sample of 91 countries listed in [1], which is the Heston-Summers data taken from Penn World Table 6.1. Consider the following conditional convergence Solow growth model ^{2}:

$$\begin{aligned}{Y}_{i}={}&\sum _{j\ne i}g\left({Z}_{ij}\right){Y}_{j}+\sum _{j\ne i}m\left({Z}_{ij}\right)\ln \left(y{60}_{j}\right)+{\theta}_{1}\left({lopen}_{i}\right)+{\theta}_{2}\left({lopen}_{i}\right)\ln \left(y{60}_{i}\right)\\ &+{\theta}_{3}\left({lopen}_{i}\right)\ln \left({s}_{i}\right)+{\theta}_{4}\left({lopen}_{i}\right)\ln ({n}_{i}+0.05)+{u}_{i},\end{aligned}$$

for $i=1,2,\dots ,91$, where ${Y}_{i}$ is the $i$th country’s average growth rate of real GDP per capita between 1960 and 1995, $y{60}_{i}$ is the $i$th country’s initial real GDP per capita in 1960, ${s}_{i}$ is the $i$th country’s average saving rate, ${n}_{i}$ is the $i$th country’s average growth rate of working-age population (ages between 15 and 64), ${lopen}_{i}$ is a scalar development index of a country defined as the logarithm of the $i$th country’s average ratio of total imports plus exports over its real GDP over the period from 1960 to 1995, and ${Z}_{ij}$ is the great-circle distance between the $i$th and $j$th countries’ capitals ^{3}.

We approximate $g(\cdot)$, $m(\cdot)$ and ${\theta}_{k}(\cdot)$, $k=1,2,3,4$, using the Laguerre polynomials with ${L}_{n}=2$. Moreover, the cross-validation selected bandwidth, ${h}_{opt}$, is calculated as 0.5285. We obtained $\rho \left({\widehat{G}}_{n}\right)=0.041$, which suggests spatial stationarity in the data. The distribution of the estimated residuals is approximately normal, as the q-q plot of the estimated residuals is close to linear. The coefficient estimates, ${\tilde{\theta}}_{1}(\cdot)$, ${\tilde{\theta}}_{2}(\cdot)$, ${\tilde{\theta}}_{3}(\cdot)$ and ${\tilde{\theta}}_{4}(\cdot)$, are presented in Figure 1, where the solid lines with circles display the second-step estimates. The dashed lines represent the estimates of the spatially-augmented Solow growth model of [4] using the inverse power spatial weight function, which we include as a baseline.

We interpret Figure 1 as follows. First, in Figure 1b, we see that there is a negative relation between the initial level of income and the economic growth rate, except for Mauritius, Hong Kong, Zambia, Cameroon and Singapore, which confirms a conditional β-convergence hypothesis. Moreover, we observe that ${\tilde{\theta}}_{2}(\cdot)$ is increasing in openness, which implies a gradually declining degree of convergence as openness rises.
In addition, we see that the nonparametric model reveals slightly weaker conditional economic growth convergence as compared to the parametric model.

Second, in Figure 1c, we see that ${\tilde{\theta}}_{3}(\cdot)$ exhibits a positive, but not monotonic, relation between the real investment rate and the real GDP per capita growth rate. Our estimate of the coefficient of the investment rate fluctuates as the trade openness of countries increases. For economies with a trade openness higher than 15% of GDP, our result indicates that the nonparametric model shows a stronger positive impact of the investment rate on the real GDP per capita growth rate than the parametric model does. Third, in Figure 1d, we observe that the population growth rate has a negative impact on the real GDP per capita growth rate. For countries whose trade openness ranges between 29% and 65%, our estimates for the coefficient of the population growth rate are relatively flat. Moreover, we note that the magnitude of the negative effect of the population growth rate grows as the trade openness of countries increases, especially when the trade openness is over 65% of GDP. Overall, Figure 1 suggests that an open economy suffers from a larger negative impact of the population growth rate, but at the same time takes advantage of a high initial real GDP per capita.

Next, due to the cross-country interactions through spatial weights, the functional coefficient estimates have a different interpretation from those obtained from a non-spatial model. In order to correctly interpret these estimates, we rewrite the estimated model in a reduced form as follows:

$$\begin{aligned}Y={}&{({I}_{n}-{\widehat{G}}_{n})}^{-1}\big[{\widehat{M}}_{n}\ln \left(y60\right)+{\tilde{\theta}}_{1}\left(lopen\right)+{\tilde{\theta}}_{2}\left(lopen\right)\odot \ln \left(y60\right)\\ &+{\tilde{\theta}}_{3}\left(lopen\right)\odot \ln \left(s\right)+{\tilde{\theta}}_{4}\left(lopen\right)\odot \ln (n+0.05)+\tilde{U}\big],\end{aligned}$$

where $\tilde{U}$ is the $n\times 1$ vector of residuals. Then, from Equation (20), the marginal effects are given by the following $n\times n$ matrices:

$$\begin{aligned}\frac{\partial E\left(Y\mid y60,s,n\right)}{\partial \ln \left(y{60}^{T}\right)}&\equiv {({I}_{n}-{G}_{n})}^{-1}\left[{M}_{n}+diag\left\{{\theta}_{2}\left(lopen\right)\right\}\right],\\ \frac{\partial E\left(Y\mid y60,s,n\right)}{\partial \ln \left({s}^{T}\right)}&\equiv {({I}_{n}-{G}_{n})}^{-1}diag\left\{{\theta}_{3}\left(lopen\right)\right\},\\ \frac{\partial E\left(Y\mid y60,s,n\right)}{\partial \ln ({n}^{T}+0.05)}&\equiv {({I}_{n}-{G}_{n})}^{-1}diag\left\{{\theta}_{4}\left(lopen\right)\right\},\end{aligned}$$

where we define $diag\left\{a\right\}$ as an $n\times n$ diagonal matrix with the elements of an $n\times 1$ vector, $a$, on the main diagonal. Following LeSage and Pace [10], we label the diagonal elements of each matrix given above as the direct impacts and the off-diagonal elements as the indirect impacts.

In Table 2, we report the estimated average direct impact (ADI) and average indirect impact (AII) of the explanatory variables, where the latter is defined as the difference between the average total impact ^{4} and the average direct impact. Average direct and indirect impacts from the parametric model of [4] are denoted as ADI${}_{EK}$ and AII${}_{EK}$, respectively. The interpretation of Table 2 is as follows. Firstly, we observe that a 1% increase in the real initial GDP per capita of an economy, holding other factors fixed, results in a decrease of 0.5% in its own real GDP per capita growth rate. However, this change increases the rest of the economies’ economic growth rates by 0.01% on average due to the spatial dependence. From another point of view, a 1% increase in all of the regions’/nations’ initial real GDP per capita speeds up this economy’s real GDP per capita growth by 0.01%. This result indicates a positive spillover effect of the initial level of income. Secondly, a 1% increase in this economy’s real investment rate increases its own real GDP growth rate by 2.08% on average. However, this change slows down the rest of the nations’ real GDP growth by 0.08% on average. Thirdly, we see that a 1% increase in the population growth rate of this economy retards its own economic growth by 3.8% on average, but helps to improve the rate of economic growth of the rest of the countries by 0.16% on average.

Fourthly, when comparing our results to those of [4], we find that both the nonparametric and the parametric model give almost the same average direct effects of the initial per capita income, investment rate and population growth rate on the economic growth rate and that both models result in the same signs in the average direct and indirect effects. However, the AII values from the nonparametric model are much smaller than the results from the parametric model in absolute value, especially for the initial per capita income and the population growth rate. This is not surprising, as the parametric model assumes that all of the spatial weights take non-negative values, while our nonparametric spatial weights are estimated from the data without such a restriction. Although it is popular practice to assume non-negative spatial weights, this is an assumption imposed without support from econometric or economic theory. For example, trade treaties and monetary policies are both double-edged swords that may bring opposite impacts to different national economies, and non-negative spatial weights may not be able to capture the opposite interactions among different economies. Both the parametric SDM of [4] and our proposed semiparametric SDM approximate the unknown true relationships to the best of their capacity; however, our model imposes fewer restrictions than the parametric SDM and is therefore expected to provide a better fit to the data and more reliable inference. Overall, we observe that the average direct and indirect effects can take opposite signs, and the effect of the former is much stronger than that of the latter in absolute magnitude.

| | ln(y60) | ln(s) | ln(n + 0.05) |
|---|---|---|---|
| ADI | −0.0050 | 0.0208 | −0.0381 |
| | (−0.0122, −0.0005) | (0.0089, 0.0341) | (−0.0782, −0.0054) |
| ADI${}_{EK}$ | −0.0119 | 0.0184 | −0.0336 |
| | (−0.0159, −0.0078) | (0.0139, 0.0229) | (−0.0585, −0.0094) |
| AII | 0.0001 | −0.0008 | 0.0016 |
| | (−0.0077, 0.0056) | (−0.0174, 0.0190) | (−0.0347, 0.0289) |
| AII${}_{EK}$ | 0.0140 | −0.0018 | 0.0275 |
| | (0.0052, 0.0244) | (−0.0169, 0.0124) | (−0.0321, 0.0860) |

Note: 95% bootstrap percentile confidence intervals are given in parentheses.

LeSage and Pace [10] (p. 39) explain how to obtain the standard errors for the ADI and AII estimates via a simulation method. In the parametric setup, as the spatial weight matrices are known, theoretically, one can apply the delta method to obtain the standard errors, and the simulation method tends to provide at best an approximation as one does not know the exact distribution of the estimated coefficients in finite samples; however, this method is feasible as the average direct and indirect impacts only depend on a finite number of unknown parameters. As our proposed semiparametric model contains both unknown spatial weights and unknown coefficient curves, the simulation method would involve simulating from a joint distribution with dimension equal to $2n\left(n-1\right)$ (two spatial weight matrix estimates) plus $4n$ (four coefficient curve estimates) or 16,744 in our empirical application. Therefore, the simulation method is infeasible for our empirical interest here. As both the ADI and AII are in the form of sample averages and it is well known that the bootstrap method can be used to estimate the sample average and its standard error well (e.g., [49]), we decide to report our bootstrap estimates of the confidence intervals for the ADI and AII.
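
As a check on the dimension quoted above, with $n=91$ countries the count works out as:

$$2n\left(n-1\right)+4n=2\times 91\times 90+4\times 91=16{,}380+364=16{,}744.$$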

Below, we explain a residual-based bootstrap method to test whether the ADI and AII are significantly different from zero at the 5% significance level. We use the nonparametric bootstrap percentile method to construct a 95% confidence interval. Following [50,51], we first estimate the functional coefficients using an oversmoothed bandwidth, which tends to zero at a slower speed than the optimal bandwidth. Then, we obtain estimated residuals. The rest of the bootstrap procedure is given below.

1. Resample the estimated residuals and obtain the bootstrap errors, ${U}^{b}$.
2. Calculate:
   $$\begin{aligned}{Y}^{b}={}&{({I}_{n}-{\widehat{G}}_{n})}^{-1}\big[{\widehat{M}}_{n}\ln \left(y60\right)+{\tilde{\theta}}_{1}^{*}\left(lopen\right)+{\tilde{\theta}}_{2}^{*}\left(lopen\right)\odot \ln \left(y60\right)\\ &+{\tilde{\theta}}_{3}^{*}\left(lopen\right)\odot \ln \left(s\right)+{\tilde{\theta}}_{4}^{*}\left(lopen\right)\odot \ln (n+0.05)+{U}^{b}\big].\end{aligned}$$
3. Estimate Model (19) from the bootstrap sample, and record ${\tilde{\theta}}_{1}^{b}(\cdot)$, ${\tilde{\theta}}_{2}^{b}(\cdot)$, ${\tilde{\theta}}_{3}^{b}(\cdot)$ and ${\tilde{\theta}}_{4}^{b}(\cdot)$, the bootstrap estimates of the functional coefficients.
4. Calculate the bootstrap values of the average direct and indirect impacts of the explanatory variables, $AD{I}^{b}$ and $AI{I}^{b}$, respectively.
5. Repeat Steps 1–4 999 times.
6. Find the 0.025 and 0.975 empirical quantiles of the 999 bootstrap values of ADI and AII and, together with the point estimates given in Table 2, establish the 95% bootstrap percentile confidence interval.
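
The resampling loop above can be sketched generically. This is a simplified illustration under stated assumptions, not the authors' code: `A` stands for $I_n-\widehat{G}_n$, `signal` for the bracketed term without the error, and `stat_fn` is a hypothetical callback that would re-estimate Model (19) on the bootstrap sample and return a scalar such as $ADI^b$ or $AII^b$.

```python
import numpy as np

def residual_bootstrap_ci(A, signal, resid, stat_fn, B=999, level=0.95, seed=0):
    """Percentile confidence interval from a residual-based bootstrap.

    A       : (n, n) matrix I - G_hat
    signal  : (n,) fitted bracketed term without the error
    resid   : (n,) estimated residuals
    stat_fn : hypothetical callback mapping a bootstrap Y^b to a scalar statistic
    """
    rng = np.random.default_rng(seed)
    n = resid.shape[0]
    stats = np.empty(B)
    for b in range(B):
        u_b = rng.choice(resid, size=n, replace=True)   # Step 1: resample residuals
        y_b = np.linalg.solve(A, signal + u_b)          # Step 2: bootstrap sample
        stats[b] = stat_fn(y_b)                         # Steps 3-4: statistic
    alpha = 1.0 - level
    lower, upper = np.quantile(stats, [alpha / 2.0, 1.0 - alpha / 2.0])  # Step 6
    return lower, upper
```

In the application, `stat_fn` would wrap the full two-step estimation, which is the computationally expensive part of each replication.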

The confidence intervals are reported in Table 2. We see that the ADI values are statistically significantly different from zero at the 5% significance level. Moreover, we find that there is no significant effect on average from neighbouring countries’ per capita initial income, savings rate and population growth rate on the economic growth rate of the country of interest. The same inference is obtained for the parametric model, except that the parametric model shows a significant average indirect impact of initial per capita income. Note that insignificant AIIs do not imply that the indirect impact from economy $i$ on economy $j$ is insignificant for all $\left(i,j\right)$.

Figure 2 presents the estimated spatial weighting functions. We plot both estimated spatial weighting functions, $\widehat{g}\left(\cdot\right)$ and $\widehat{m}\left(\cdot\right)$, for geographic distances ranging from zero to 20 (in units of 100 km), as the estimated spatial weights have average absolute values of $5.069\times {10}^{-7}$ and $3.893\times {10}^{-9}$, respectively, when $z>20$. In Figure 2a,b, firstly, we see negative spatial weights, which contradicts traditional parametric spatial regression models, which often assume non-negative spatial weights. Negative spatial interactions are indeed common in practice, especially in social networks; see [52,53] for strategic interactions within the monetary policy committee of the Bank of England. Secondly, both spatial weight functions are not strictly monotonic and exhibit convexity among nations that are not very far apart, concavity among nations at moderate distances and a zero spatial weight function among nations that are far away. Moreover, the spatial weight functions take larger absolute values among nations that are closer together and smaller absolute values among far-away nations, which implies a relatively larger economic interaction among nearby nations than among far-away nations. Lastly, in Figure 2b, we observe positive estimated spatial weights when the distance ranges between 0.229 and 1.894, which correspond to 22.9 and 189.4 km, respectively. For distances greater than 189.4 km, the negative spatial weights get closer to zero as the distance between two countries increases. As the turning point of 1.894 is very small and can only occur between countries with small areas, our results imply that the spatial interaction between two nearby small countries is strong and differs from that between two countries farther apart, where at least one country has a large area.

## 6. Conclusions

We employ a spatial Durbin model combined with nonparametric spatial weighting functions, as well as unknown functional coefficients, to estimate the augmented Solow growth model with a sample of 91 countries over the period 1960 to 1995. We find a negative spatial lag effect of a neighbouring country’s GDP per capita growth rate and initial GDP per capita on the economic growth rate of country $i$. These effects decline in magnitude as the geographical distance between the two countries increases. Finally, allowing the coefficients to be functions of a country’s trade openness enables us to see the country-specific effect of each determinant of economic growth. Moreover, we find a significant average direct impact from each production factor. However, our findings show that the average indirect impact of these variables is insignificant at the 5% significance level.

## Acknowledgments

We would like to thank two anonymous referees for their insightful and instructive comments. We are also grateful for comments by participants at the 8th Midwest Graduate Student Summit hosted by Purdue University at Lafayette in April 2015, the 49th Annual Conference of the Canadian Economics Association hosted by Ryerson University in Toronto in May 2015 and the 32nd Meeting of the Canadian Econometric Study Group hosted by University of Guelph at Guelph in September 2015.

## Author Contributions

The authors contributed equally to the paper.

## Conflicts of Interest

The authors declare no conflict of interest.

## References

1. N.G. Mankiw, D. Romer, and D.N. Weil. “A contribution to the empirics of economic growth.” Q. J. Econ. 107 (1992): 407–437.
2. D. Backus, E. Henriksen, and K. Storesletten. “Taxes and the global allocation of capital.” J. Monet. Econ. 55 (2008): 48–61.
3. B.H. Baltagi, P. Egger, and M. Pfaffermayr. “Estimating regional trade agreement effects on FDI in an interdependent world.” J. Econom. 145 (2008): 194–208.
4. C. Ertur, and W. Koch. “Growth, technological interdependence and spatial externalities: Theory and evidence.” J. Appl. Econom. 22 (2007): 1033–1062.
5. C. Ertur, and W. Koch. “A contribution to the theory and empirics of Schumpeterian growth with worldwide interactions.” J. Econ. Growth 16 (2011): 215–255.
6. A. Cassette, J. Creel, E. Farvaque, and S. Paty. “Governments under influence: Country interactions in discretionary fiscal policy.” Econ. Model. 30 (2013): 79–89.
7. J.K. Brueckner. “Strategic interaction among governments: An overview of empirical studies.” Int. Reg. Sci. Rev. 26 (2003): 175–188.
8. A. Cliff, and J.K. Ord. Spatial Processes: Models & Applications. London, UK: Pion, 1981.
9. L. Anselin. Spatial Econometrics: Methods and Models. Dordrecht, The Netherlands: Kluwer Academic Publishers, 1988.
10. J.P. LeSage, and R.K. Pace. Introduction to Spatial Econometrics. New York, NY, USA: Chapman and Hall/CRC, 2009.
11. L. Anselin, and R. Florax. New Directions in Spatial Econometrics. Berlin, Germany: Springer, 2011.
12. M. Abreu, H.L.F. de Groot, and R.J.G.M. Florax. “Space and growth: A survey of empirical evidence and methods.” Région et Développement 21 (2005): 12–43.
13. S.N. Durlauf, A. Kourtellos, and A. Minkin. “The local Solow growth model.” Eur. Econ. Rev. 45 (2001): 928–940.
14. T.P. Mamuneas, A. Savvides, and T. Stengos. “Economic development and the return to human capital: A smooth coefficient semiparametric approach.” J. Appl. Econom. 21 (2006): 111–132.
15. A. Kourtellos. “Modeling parameter heterogeneity in cross-country regression models.” In Economic Growth and Development (Frontiers of Economics and Globalization). Edited by O. de la Grandville. Bingley, UK: Emerald Group Publishing Limited, 2011, pp. 367–387.
16. J.B. De Long, and L.H. Summers. “Equipment investment and economic growth.” Q. J. Econ. 106 (1991): 445–502.
17. R. Moreno, and B. Trehan. “Location and the growth of nations.” J. Econ. Growth 2 (1997): 399–418.
18. H.B. Chua. “On Spillovers and Convergence.” Ph.D. Thesis, Harvard University, Cambridge, MA, USA, 1993.
19. A. Ades, and H.B. Chua. “Thy neighbour’s curse: Regional instability and economic growth.” J. Econ. Growth 2 (1997): 279–304.
20. R.J. Barro, and X. Sala-i-Martin. Economic Growth. New York, NY, USA: McGraw-Hill, 1995.
21. C. Ertur, J. le Gallo, and C. Baumont. “The European regional convergence process, 1980–1995: Do spatial regimes and spatial dependence matter?” Int. Reg. Sci. Rev. 29 (2006): 3–34.
22. R. Basile. “Regional economic growth in Europe: A semiparametric spatial dependence approach.” Pap. Reg. Sci. 87 (2008): 527–544.
23. J.C. Murdoch, M. Rahmatian, and M.A. Thayer. “A spatially autoregressive median voter model of recreation expenditures.” Public Financ. Q. 21 (1993): 334–350.
24. J.P. LeSage. “What regional scientists need to know about spatial econometrics.” Rev. Reg. Stud. 44 (2014): 13–32.
25. J.P. LeSage, and R.K. Pace. “The biggest myth in spatial econometrics.” Econometrics 2 (2014): 217–249.
26. A.C. Case, and H.S. Rosen. “Budget spillovers and fiscal policy interdependence.” J. Public Econ. 52 (1993): 285–307.
27. J. Pinkse, M.E. Slade, and C. Brett. “Spatial price competition: A semiparametric approach.” Econometrica 70 (2002): 1111–1153.
28. T.G. Conley, and E. Ligon. “Economic distance and cross-country spillovers.” J. Econ. Growth 7 (2002): 157–187.
29. L.F. Lee. “Identification and estimation of econometrics models with group interactions, contextual factors and fixed effects.” J. Econom. 140 (2007): 333–374.
30. Y. Sun. Functional-Coefficient Spatial Autoregressive Models with Nonparametric Spatial Weights. Working Paper; Guelph, ON, Canada: University of Guelph, 2015.
31. A. Ahrens, and A. Bhattacharjee. “Two-step Lasso estimation of the spatial weights matrix.” Econometrics 3 (2015): 128–155.
32. R. Basile, M. Durbán, R. Mínguez, J.M. Montero, and J. Mur. “Modeling regional economic dynamics: Spatial dependence, spatial heterogeneity and nonlinearities.” J. Econ. Dyn. Control 48 (2014): 229–245.
33. G.A.F. Seber. A Matrix Handbook for Statisticians. Hoboken, NJ, USA: John Wiley & Sons, Inc., 2008.
34. L. Su. “Semiparametric GMM estimation of spatial autoregressive models.” J. Econom. 167 (2012): 543–560.
35. H.H. Kelejian, and G. Piras. “Estimation of spatial models with endogenous weighting matrices, and an application to a demand model for cigarettes.” Reg. Sci. Urban Econ. 46 (2014): 140–149.
36. X. Qu, and L.F. Lee. “Estimating a spatial autoregressive model with an endogenous spatial weight matrix.” J. Econom. 184 (2015): 209–232.
37. K. Ord. “Estimation methods for models of spatial interaction.” J. Am. Stat. Assoc. 70 (1975): 120–126.
38. L.F. Lee. “Asymptotic distributions of quasi-maximum likelihood estimators for spatial autoregressive models.” Econometrica 72 (2004): 1899–1925.
39. Y. Bao, and A. Ullah. “Finite sample properties of maximum likelihood estimator in spatial models.” J. Econom. 137 (2007): 396–413.
40. H.H. Kelejian, and I.R. Prucha. “A generalized spatial two-stage least squares procedure for estimating a spatial autoregressive model with autoregressive disturbances.” J. Real Estate Financ. Econ. 17 (1998): 99–121.
41. L.F. Lee. “Best spatial two-stage least squares estimators for a spatial autoregressive model with autoregressive disturbances.” Econom. Rev. 22 (2003): 307–335.
42. X. Chen. “Large sample sieve estimation of semi-nonparametric models.” In Handbook of Econometrics 6B. Edited by J.J. Heckman and E.E. Leamer. New York, NY, USA: Springer, 2007, pp. 5549–5632.
43. Q. Li, and J.S. Racine. Nonparametric Econometrics. Princeton, NJ, USA: Princeton University Press, 2007.
44. L. Anselin, and A. Bera. “Spatial dependence in linear regression models with an introduction to spatial econometrics.” In Handbook of Applied Economic Statistics. Edited by A. Ullah and D.E.A. Giles. New York, NY, USA: Marcel Dekker, 1998, pp. 237–289.
45. L.F. Lee. “GMM and 2SLS estimation of mixed regressive, spatial autoregressive models.” J. Econom. 137 (2007): 489–514.
46. D.W. Andrews. “Asymptotic normality of series estimators for nonparametric and semiparametric regression models.” Econometrica 59 (1991): 307–345.
47. W.K. Newey. “Convergence rates and asymptotic normality for series estimators.” J. Econom. 79 (1997): 147–168.
48. J. Fan, and I. Gijbels. Local Polynomial Modelling and Its Applications. New York, NY, USA: Chapman and Hall/CRC, 1996.
49. B. Efron, and R.J. Tibshirani. An Introduction to the Bootstrap. New York, NY, USA: Chapman & Hall, 1993.
50. W. Härdle, and J.S. Marron. “Bootstrap simultaneous error bars for nonparametric regression.” Ann. Stat. 19 (1991): 778–796.
51. Y. Sun. “A consistent nonparametric equality test of conditional quantile functions.” Econom. Theory 22 (2006): 614–632.
52. A. Bhattacharjee, and S. Holly. “Structural interactions in spatial panels.” Empir. Econom. 40 (2011): 69–94.
53. A. Bhattacharjee, and S. Holly. “Understanding interactions in social networks and committees.” Spat. Econ. Anal. 8 (2013): 23–53.

^{2.} Since the sample size is less than 100, we include only one spatially-lagged exogenous variable, ${M}_{n}\ln \left(y60\right)$, to have better finite sample estimation accuracy. Moreover, the reason behind this choice is that spatial lag effects from the savings rate and the population growth rate were not found significant in Table IV of Ertur and Koch [4].

^{3.} We follow [4] in calculating the variable ${Z}_{ij}$:

$${Z}_{ij}=radius\times \arccos \left[\cos \left(\left|{long}_{i}-{long}_{j}\right|\right)\cos \left({lat}_{i}\right)\cos \left({lat}_{j}\right)+\sin \left({lat}_{i}\right)\sin \left({lat}_{j}\right)\right],$$

^{4.} As stated by LeSage and Pace [10] (p. 37), the average total impact can be interpreted in two different ways, which give the same numerical result. The first viewpoint describes the influence of a change in the initial real GDP per capita of one economy on all of the regions, while the second viewpoint describes the impact of changes in the initial real GDP per capita of all economies on a single region/nation.
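
The distance formula in footnote 3 can be sketched as follows. The radius value of 6371 km (mean Earth radius), the use of degrees as input units and the function name `great_circle` are assumptions for illustration; the text does not state the radius used. Dividing the result by 100 would give distances in units of 100 km.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius; not stated in the text

def great_circle(lat_i, lon_i, lat_j, lon_j, radius=EARTH_RADIUS_KM):
    """Great-circle distance between two points given in degrees."""
    lat_i, lon_i, lat_j, lon_j = map(np.radians, (lat_i, lon_i, lat_j, lon_j))
    c = (np.cos(np.abs(lon_i - lon_j)) * np.cos(lat_i) * np.cos(lat_j)
         + np.sin(lat_i) * np.sin(lat_j))
    # Clip to guard against rounding slightly outside [-1, 1]
    return radius * np.arccos(np.clip(c, -1.0, 1.0))
```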

© 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons by Attribution (CC-BY) license ( http://creativecommons.org/licenses/by/4.0/).